Launched in 2021 in Bengaluru, Karya describes itself as “the world’s first ethical data company.” The company sells data to big tech companies and other clients at the market rate. However, instead of keeping much of that cash as profit, it directs the money toward the rural poor in India, keeping only what is required to sustain its operations. Karya partners with local NGOs to ensure that its jobs reach the poorest of the poor, as well as historically marginalized communities. This way, Karya is able to pay its workers $5 an hour, which is 20 times the Indian minimum wage. Karya also gives workers de facto ownership of the data they create on the job, so whenever that data is resold, the workers receive the proceeds on top of their past wages.
Karya was co-founded by Manu Chopra and Vivek Seshadri, who were later joined by Safiya Husain. The platform was formalised after Chopra and Seshadri had worked on the idea for four years at Microsoft Research. Based on extensive field studies, the duo found that data generation work could be done to a high standard of accuracy even by workers with no formal training or knowledge of English. They also established that the only requirement would be a smartphone, making it possible to reach not just city dwellers but the poorest of the poor in rural India.
Background
Recent years have seen a proliferation of large language models such as those behind ChatGPT. These models work best in languages like English, because they draw on the large amounts of English-language data widely available on the internet. Many other languages, including Indian languages, have a far smaller presence online. Owing to this scarcity, there is substantial demand for datasets containing text or voice data in these languages. This demand stems from tech companies aiming to enhance their AI tools and cater for a much broader audience.
Figures suggest that the AI data sector was worth $2 billion in 2022 and is expected to rise to $17 billion by 2030. But very little of this money reaches the data workers. The process of data generation is fraught with the exploitation of workers, especially in the poorer economies to which this kind of work is outsourced. Experts suggest that this is also reflected in the quality of the data generated. Karya aims to tackle these very problems with an ecosystem of ethical data usage in which data can empower communities both financially and technologically.
Workers and network partnerships
Karya uses a user-friendly application that follows a work-from-anywhere model, so anyone who owns a smartphone is eligible to work with Karya. When the platform began its operations, it did not have strict eligibility criteria. However, the team soon realised that the platform was not reaching the poorest communities. To ensure that it reaches marginalised castes, genders and religions, the company turned to network partnerships, teaming up with local grassroots NGOs that distribute access codes in line with Karya’s income and diversity requirements. Currently, 200 such partners handle the onboarding process for the platform.
Clients and process
Even though the platform is still very young, it has already amassed several high-profile clients such as MIT, Stanford, Microsoft and the Bill and Melinda Gates Foundation. Karya collects the data requirements of its clients and breaks them down into bite-sized tasks. Once completed by workers, these tasks are collected and synthesized into high-quality AI training datasets.
Chopra told Time that the workers involved in the process have only a very rudimentary idea of what AI is. So, when explaining the process, Karya tells workers that they have to “teach their mother tongue to the computer”.
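The process described in this section (a client request broken into bite-sized tasks, then completed tasks gathered back into a dataset) can be illustrated with a short sketch. This is not Karya's actual software, which the article does not describe. The `Task` class, the function names, and the sample prompts and file names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """One bite-sized unit of work (hypothetical structure)."""
    task_id: int
    prompt: str                      # e.g. a sentence to read aloud
    response: Optional[str] = None   # filled in once a worker completes it


def split_into_tasks(prompts: list[str]) -> list[Task]:
    """Turn a client's list of prompts into one small task per prompt."""
    return [Task(task_id=i, prompt=p) for i, p in enumerate(prompts)]


def assemble_dataset(tasks: list[Task]) -> list[dict]:
    """Collect only the completed tasks into dataset records."""
    return [
        {"id": t.task_id, "prompt": t.prompt, "response": t.response}
        for t in tasks
        if t.response is not None
    ]


# Hypothetical client request: three sentences to be recorded.
tasks = split_into_tasks(["sentence one", "sentence two", "sentence three"])

# Two of the three tasks are completed by workers (made-up file names).
tasks[0].response = "recording_0.wav"
tasks[2].response = "recording_2.wav"

dataset = assemble_dataset(tasks)
print(len(dataset), "of", len(tasks), "tasks completed")
```

Keeping each task small and independent is what would let work of this kind be done on a phone, a few minutes at a time, with unfinished tasks simply left out of the final dataset.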
Checks and balances
Karya is registered as a non-profit in the U.S. that controls two entities in India: one non-profit and one for-profit. The for-profit is legally bound to donate any profits it makes (after reimbursing workers) to the non-profit, which reinvests them. The structure exists because Indian law prevents non-profits from earning more than 20% of their income from the market as opposed to donations. However, the platform does accept grants. Chopra told Time that the arrangement has the benefit of removing any incentive for him or his co-founders to compromise on worker salaries or well-being in return for lucrative contracts.
Challenges
- The work offered through Karya is only supplementary in nature. It is also temporary: a worker can earn a maximum of $1,500 (close to the average annual income in India), after which a new worker takes their place. In addition, the current model relies on grants to fund its operations, which leaves it vulnerable if that funding stops.
- Data suggests that the digital divide between rural and urban areas has been widening. Smartphone sales in smaller towns and villages have stagnated at 35-40% of the total since mid-2021. In such a situation, the prerequisite of owning a smartphone excludes many people from marginalized communities from the opportunity.
- Chopra told Time that the biggest bottleneck for the company is the amount of available work. He also stressed that large-scale awareness is required to promote ethical ways of generating data. Additionally, more tech companies and institutions should source their AI training datasets in such ways.
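To make the earnings cap from the first challenge concrete, the short calculation below combines it with the $5 hourly pay cited at the start of the article. It is a rough sketch under two assumptions not stated in the source: that a worker is paid a flat $5 for every hour worked, and that proceeds from data resale are not counted toward the cap.

```python
HOURLY_PAY_USD = 5        # hourly pay cited in the article
EARNINGS_CAP_USD = 1500   # per-worker maximum cited in the article


def hours_until_cap(hourly_pay: float, cap: float) -> float:
    """Hours of work before the cap is reached, assuming a flat hourly rate."""
    return cap / hourly_pay


hours = hours_until_cap(HOURLY_PAY_USD, EARNINGS_CAP_USD)
print(hours)  # 300.0
```

Under these assumptions, a worker would reach the cap after about 300 hours of work, which illustrates why the income is described as supplementary rather than as a long-term livelihood.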
Impact and future goals
Several workers who currently work with Karya told Time that they earn more in an hour through the platform than they usually get from working in the fields for an entire day. The work is also easy compared with the gruelling physical labour that fieldwork entails. Since its inception, Karya has paid out about 6.5 crore rupees in wages to 30,000 workers in the country.
The organization aims to reach 100 million people by 2030. Most importantly, the platform challenges exploitation in the data sector and points the way to an ethical approach to artificial intelligence, one that compensates the workers in its lowest echelons.