Ryan Kolln is the Chief Executive Officer and Managing Director of Appen. Ryan brings over 20 years of global experience in technology and telecommunications, along with a deep understanding of Appen’s business and the AI industry.
His professional career began as an engineer, with a focus on mobile network data engineering in Australia, Asia and North America. On completion of an MBA from New York University, Ryan joined The Boston Consulting Group (BCG) in 2011 as a strategy consultant. During his time at BCG he specialised in technology and telecommunications and gained deep strategy expertise across a variety of growth and operational topics.
He joined Appen in 2018 as VP of Corporate Development, where he led strategic acquisitions such as Figure Eight and Quadrant and supported the establishment of the China and Federal divisions. Prior to his appointment as CEO, he served as Chief Operating Officer, overseeing global operations and strategy.
With over 20 years of experience in technology and telecommunications, how has your career path shaped your approach to leading Appen through the rapidly evolving AI landscape?
My career began as a telecommunications engineer, where my role was to build and optimize networks. The work involved a huge amount of data and analytics, and finding innovative solutions to improve network performance and customer experience.
After I completed my MBA at NYU, my career evolved into leadership roles in tech strategy and mergers & acquisitions, where I focused on bigger strategic questions, such as emerging trends, investment opportunities, and business models. This background has given me a deep understanding of both the technical and business aspects of emerging technologies.
At Appen, we work at the intersection of AI and data, and my experience has allowed me to lead the company and navigate complexities in the rapidly evolving AI space, moving through major developments like voice recognition, NLP, recommendation systems, and now generative AI. This strategic vision is crucial as AI continues to transform industries globally.
You’ve been with Appen since 2018, driving major acquisitions like Figure Eight and Quadrant. How have these strategic moves positioned Appen as a leader in AI data services, and what do you see as the next big opportunity for the company?
The acquisitions of Figure Eight and Quadrant were key to expanding our AI data capabilities, particularly in areas like data annotation and geolocation intelligence. Figure Eight’s data annotation platform was especially impactful. The platform is highly customizable, and we have used it for work in many different domains. More recently, we have been using it to run most of our generative AI dataflows.
In addition to the acquisitions, about 5 years ago we set up an operation in China called Appen China. We are now the largest AI data company in China, with revenue almost double that of our nearest competitors.
Looking forward, the focus for Appen is on supporting the development and adoption of generative AI. There are major growth opportunities with both the model builders and the companies looking to adopt generative AI in their products and operations. We feel we are just at the beginning of the largest AI wave.
Data quality plays a crucial role in AI model development. Could you share how Appen ensures the accuracy, diversity, and relevance of its datasets, especially with the increasing demand for high-quality LLM training data?
Appen’s strength is our ability to create high-quality data consistently and at scale. We work closely with our customers to understand their AI model objectives and develop high-quality data for their needs through a multi-layered approach that combines automated tools and human feedback. We have a global workforce of over 1 million across 200+ countries, which allows us to curate a group of qualified and diverse contributors. Through rigorous quality control and feedback loops, we ensure that the data is accurate, consistent, and relevant, and can be used to effectively improve the performance of AI models. Data of this quality helps AI systems operate effectively in real-world environments, and it can also be used to improve robustness and reduce bias, especially for LLMs.
Synthetic data generation is gaining popularity, and Appen’s investment in Mindtech highlights your interest in this area. Could you discuss the advantages and disadvantages of using synthetic or web-scraped data versus crowdsourced data for training AI models, and how you see synthetic data complementing the crowdsourced data Appen is known for?
High-quality data is crucial but can be costly and time-consuming to produce, which is why synthetic data is gaining attention. It works well for structured data in traditional AI/ML tasks, especially in industries with strict privacy regulations like healthcare and finance, as it avoids using personal information.
However, synthetic data often lacks the depth and nuance of real-world data, especially for complex Generative AI tasks that require diversity and deep expertise. It can also perpetuate errors or biases from the original data. Web-scraped data, commonly used for LLMs, presents its own challenges with low-quality content, bias, and misinformation, requiring careful curation.
Crowdsourced data, which Appen specializes in, remains the “ground truth.” Human expertise is vital for generating the diverse, complex data needed to improve AI model accuracy and ensure alignment with human values.
We view synthetic data as complementary to our human-annotated data. While synthetic data can accelerate parts of the process, human-labelled data ensures models reflect real-world diversity. Together, they provide a balanced approach to creating high-quality training data for AI.
The EU AI Act and other global regulations are shaping the ethical standards around AI development. How do you see these regulations influencing Appen’s operations and the broader AI industry moving forward?
The EU AI Act and similar global regulations are likely to influence Appen’s operations by setting new ethical standards for AI model development and performance. We may see changes in how we handle data, ensure model fairness, and address ethical considerations. This could lead to more rigorous processes and potential adjustments in our approach to model training and validation.
Broadly, these regulations will likely drive the industry towards higher ethical standards, increase compliance costs, and potentially slow down some aspects of innovation. However, they will also push for greater accountability and transparency, which could ultimately lead to more responsible and sustainable AI development.
With growing concerns around bias in AI, how does Appen work to ensure that the datasets used to train AI models are ethically sourced and free from bias, particularly in sensitive areas like natural language processing and computer vision?
We actively work to reduce bias by fostering diversity and inclusion across our projects. It is encouraging to see that many of our customers are focused on capturing broad demographics in data collection and model evaluation tasks. Having a global crowd that resides in most countries enables us to source data from a wide range of perspectives and experiences, which is especially important in sensitive areas like natural language processing and computer vision.
In 2019, we formalized our best practices into the Crowd Code of Ethics, showing our dedication to diversity, fairness, and crowd wellbeing. This includes our commitment to fair pay, ensuring our crowd’s voice is heard, and maintaining strict privacy protections. By upholding these principles, we aim to deliver high-quality, ethically sourced data that supports responsible AI development.
As AI becomes more integrated into industries like automotive, advertising, and AR/VR, how is Appen positioning itself to meet the increasing demand for specialized training data in these sectors?
Over the last 27 years, we have provided specialized training data for a diverse range of industries and use cases, and we continue to evolve as our customer needs evolve.
As an example, in automotive, we worked with leading automotive companies and in-cabin solution providers to build in-vehicle speech systems. Now, we are helping our customers in new areas like video data collection of drivers, which improves safety by monitoring driver distraction.
In advertising, we helped a leading global advertising platform improve the quality and accuracy of ads for user relevance over a large multi-year global program with 7M+ evaluations. Now, as many of the platforms are adopting generative AI solutions, our crowd are not only assessing the relevance of ads but also helping evaluate the quality of generated ads.
We have been able to do all of this through our robust annotation platform which can be customized to support complex workflows and various data modalities including text, audio, image, video, and multimodal annotation. But ultimately, our ability to move with the changing industry comes down to our deep expertise in data for AI development and strong partnership with our customers.
Appen has been a leader in providing high-quality data for a variety of AI applications. Looking forward, how do you see Appen’s role evolving as generative AI and LLMs continue to develop and influence global markets?
Generative AI and LLMs are transforming industries, and we will continue to play a critical role in providing high-quality data to support these advancements. When it comes to global markets, our ability to source across 200 countries and 500+ languages will become even more valuable, and we have a strong history of this as we helped companies like Microsoft launch Machine Translation models for over 110 languages.
As the deployment of LLM applications grows, we see increasing demand for alignment with human end users, including localization capabilities to ensure language and cultural nuances are addressed in various global markets. We’re committed to helping companies develop AI systems that are both performant and responsible by ensuring that the data used to train these models is diverse, relevant, and ethically sourced.
Appen is known for powering some of the world’s most advanced LLMs. What are some of the innovations in data annotation and collection that Appen is focusing on to enhance the performance of these models?
We’re continuously innovating our data annotation and collection processes to enhance the performance of LLMs. One area of focus is improving the efficiency and accuracy of data annotation through advanced AI-assisted tools, which help to streamline and automate parts of the process while maintaining high-quality standards.
We can identify data points that need further human input, ensuring that annotation efforts are targeted where they will make the most impact. We have integrated features in our platform like Model Mate which can be used to help accelerate data production and improve data quality. We are also focused on best practices in contributor management, which is important as the complexity of tasks increases.
Key to contributor management is the ability to understand contributor-level performance and provide feedback that continuously improves the quality of our human-generated data. These innovations allow us to provide the high-quality, large-scale data required to power and fine-tune the world’s leading LLMs.
As you step into your new role as CEO, what are your top priorities for Appen over the next few years, and how do you plan to drive the company’s growth in the highly competitive AI space?
As I transition into the role of CEO, my strategic priorities are designed to ensure Appen’s leadership in the competitive AI landscape:
- Supporting the development of generative AI models: Over the last 18 months, generative AI has become a key component of our service offering, with 28% of group revenue coming from generative AI-related projects in June 2024 compared to 8% in January. We see significant potential in the generative AI market, which is projected to reach $1.3 trillion by 2032 according to industry forecasts.
- Supporting the adoption of generative AI models: We see growth in new segments as enterprises leverage generative AI solutions for their use cases. Although the percentage of generative AI projects reaching deployment is low, we anticipate that FY24/25 will be a transition period where experiments move to production, and drive demand for custom high-quality and specialized data.
- Optimizing and automating the way we prepare data: By utilizing AI for quality assurance and automating certain steps of the data preparation process, we can enhance data quality while also improving operational efficiency and our gross margins.
- Evolving the experience for our crowd workers: Our new CrowdGen platform enables us to scale projects quickly and flexibly in line with our customer needs, utilizing AI for automated screening and project matching. This will also improve our contributor experience through personalized support. Appen has been an early adopter in promoting transparency, diversity, and fairness in our data sourcing, and we remain committed to our Crowd Code of Ethics.
These priorities will position Appen for sustained growth and innovation in the evolving AI landscape.
Thank you for the great interview. Readers who wish to learn more should visit Appen.