
Ask an FCAT Researcher: David Bracken on Synthetic Data

FCAT researcher David Bracken focuses on New Business Foundations. He digs into the newest ideas that companies are leveraging to grow revenue and has a special interest in emerging technologies, including the ways in which customers will use them. Through his work, he has researched everything from the impact of memes on our culture to blockchain technologies and social ties in the digital age.

Lately, he has been exploring the opportunities and challenges surrounding synthetic data, a version of existing data that has been altered to remove private and/or personally identifying information.

Q: Why is synthetic data a hot topic right now?

A: The foundational generative AI models currently on the market have largely been trained on the enormous amount of data that companies have scraped off the internet. Now, they are running out of new data to use, which has led to increasing experimentation with synthetic data to solve some of these data scarcity issues.

Synthetic data is not new. Autonomous-driving companies have been using it for some time, and interest also picked up significantly when more stringent privacy laws were passed in Europe about six years ago. Companies began looking into whether synthetic data could help them get around some of these regulations, but generative AI has triggered a new, growing wave of interest in the technology.

Q: Who is most interested in using synthetic data?

A: One of the main reasons that synthetic data is attractive, particularly to companies that are heavily regulated, is that some standard ways of scrubbing data can be reverse-engineered. They are not foolproof. So, organizations are interested in finding better approaches to strip out identifying factors, but in such a way that the data remains valuable for their purposes.

Synthetic data vendors can create new, fully anonymous datasets by training models on the statistical properties of the data without having them memorize any personal information.
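To make the idea of learning statistical properties concrete, here is a minimal sketch in Python. It assumes a single numeric column and a simple Gaussian model; the account balances and the function names are hypothetical and chosen only for illustration. Commercial tools use far richer models and add explicit privacy safeguards, so this should be read as a toy, not as how any vendor actually works.

```python
import random
import statistics

def fit_gaussian(values):
    """Summarize a numeric column by its mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)

def sample_synthetic(params, n, seed=0):
    """Draw n new values from the fitted distribution.

    Only the two summary statistics are used here, so no original
    record is copied into the output.
    """
    rng = random.Random(seed)
    mean, stdev = params
    return [rng.gauss(mean, stdev) for _ in range(n)]

# Hypothetical "real" account balances (illustrative values only).
real_balances = [1200.0, 950.0, 1430.0, 1010.0, 1310.0, 880.0, 1150.0, 1270.0]

params = fit_gaussian(real_balances)
synthetic_balances = sample_synthetic(params, n=1000)
```

The synthetic column has roughly the same average and spread as the original, which is what keeps it useful for analysis even though none of its rows corresponds to a real customer.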

Q: Once they have the synthetic data, how do they apply it?

A: There are a lot of different use cases to consider. It can be implemented in places where companies aren't able to use traditional data or there isn't enough data to do what they need to do. Fighting fraud is a good example. Organizations might not have much data on a particular kind of fraud, but they want to be able to train their models so that it can be automatically identified in their systems. So, one option is to create synthetic data that looks like the fraud they are trying to catch, which will help their models get better at uncovering potentially fraudulent activity.
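One simple way to create extra examples of a rare class, sketched below, is to interpolate between pairs of known cases (the idea behind oversampling methods such as SMOTE, simplified here to random pairs instead of nearest neighbors). The fraud records, their two features, and the function name are hypothetical; this is an illustration under those assumptions, not a description of any particular firm's approach.

```python
import random

def interpolate_minority(samples, n_new, seed=0):
    """Create n_new synthetic points on the line segments between
    randomly chosen pairs of known minority-class examples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)   # two distinct known cases
        t = rng.random()                # position between a and b
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic

# Hypothetical fraud cases: (transaction amount, hours since account opened).
fraud_cases = [(9800.0, 3.0), (12500.0, 1.0), (11000.0, 2.0)]

synthetic_fraud = interpolate_minority(fraud_cases, n_new=50)
```

Each synthetic record falls between two observed fraud cases, so a detection model sees more variety in the rare class without anyone inventing patterns that lie far outside what was observed.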

Synthetic data can also be used for customer acquisition and onboarding, as well as software testing. Firms are exploring whether they can use synthetic data to help get their software to market faster, since it can expedite access to the production data software engineers need to move projects forward.

Q: Are there examples where synthetic data is not just faster but better?

A: A lot of traditional datasets are problematic because they are not representative of society or the marketplace for a product. This can lead to biased analysis and decision-making. Synthetic data representing underrepresented groups can be added to help correct for these imbalances.

Q: Where can synthetic data use go wrong?

A: Researchers have started to explore what happens when large language models are trained on significant amounts of synthetic data. Some of this research has found that synthetic data can cause the models to rapidly deteriorate, a phenomenon often referred to as model collapse.
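A toy analogue of this effect can be sketched with a much simpler model than a language model. In the Python sketch below, a Gaussian is repeatedly refit to samples drawn from its own previous fit, so every generation learns only from the last generation's synthetic output. The function name, sample size, and number of generations are assumptions made for illustration, and the behavior of real language models is far more complex.

```python
import random
import statistics

def simulate_refitting(generations, sample_size, seed=0):
    """Refit a Gaussian to its own samples, generation after generation,
    and record the fitted spread at each step."""
    rng = random.Random(seed)
    mean, stdev = 0.0, 1.0          # the original "real" distribution
    history = [stdev]
    for _ in range(generations):
        # Each generation sees only data produced by the previous model.
        samples = [rng.gauss(mean, stdev) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        stdev = statistics.pstdev(samples)
        history.append(stdev)
    return history

spread = simulate_refitting(generations=200, sample_size=10)
print(f"initial spread: {spread[0]:.3f}, final spread: {spread[-1]:.3g}")
```

Because each refit loses a little of the tails, the fitted spread tends to shrink over many generations, and the model gradually forgets the variety in the original data.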

Others have been exploring whether synthetic data can introduce more bias into AI models and if it might complicate our understanding and interpretations surrounding generative AI decision-making.

Q: More companies may be experimenting with synthetic data, but how common is it in the overall data market?

A: Most of the forecasts currently peg this market at about $300M annually, and there are a few reasons it is not larger. The main one is that there just aren't a lot of standards around creating synthetic data. It's hard to gauge the accuracy of the data being produced. The vendors that help companies create synthetic data each have their own way of evaluating quality, and as standards improve, I think we can expect this market to grow significantly.
