Generative AI Has an Intellectual Property Problem
As with any new technology, generative AI raises legal and ethical issues that must be addressed. One of the most important is copyright, which determines who owns the rights to creative works and how those works may be used. Companies that rely on generative AI without understanding the local legislation on generative AI copyright risk reputational damage and legal penalties.
To protect themselves from these risks, companies that use generative AI need to ensure that they comply with the law and take steps to mitigate potential risks, such as using training data free of unlicensed content and developing ways to show the provenance of generated content. Generative AI refers to artificial intelligence algorithms (such as large language models) that can create new content based on the data they were trained on. The learning process involves giving the algorithm many sample input-output pairs (say, a picture of a growling dog paired with the text “dog growling”) so that its internal parameters can be adjusted to capture the relationships between inputs and outputs that the algorithm infers. When given a new input from an end user, the algorithm can then use its tuned internal parameters to generate an output that reflects the training data.
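The training loop described above can be sketched in a few lines. This is a deliberately toy illustration, not an actual generative model: a single weight stands in for billions of internal parameters, and the `train` and `generate` functions are hypothetical names chosen for this example.

```python
# Toy sketch of learning from input-output pairs: the model sees many
# (input, output) examples and nudges its internal parameter so that
# its predictions move toward the training outputs.

def train(pairs, lr=0.01, epochs=500):
    w = 0.0  # the model's single "internal parameter"
    for _ in range(epochs):
        for x, y in pairs:
            pred = w * x               # model's current output for this input
            grad = 2 * (pred - y) * x  # gradient of the squared error
            w -= lr * grad             # adjust the parameter toward the target
    return w

def generate(w, x):
    # With tuned parameters, a new input yields an output that reflects
    # the relationships seen in the training data.
    return w * x

# These pairs implicitly encode the rule "output = 3 * input".
pairs = [(1, 3), (2, 6), (3, 9)]
w = train(pairs)
print(round(generate(w, 4)))  # a new input; the output is near 12
```

The copyright question arises because the equivalent of `pairs` for a real model is a massive corpus scraped from the web, which may contain copyrighted works.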
On the related topic of software generation, it is also worth mentioning the class action lawsuit against Microsoft, GitHub, and OpenAI concerning GitHub Copilot (reported here; case updates here). First, and unrelated to copyright, providers of such systems must comply with the separate transparency and information obligations outlined in Article 52(1). The provision then adds a number of additional information requirements regarding human-AI interaction.
As a counterpoint, some commentators note that these tools benefit many artists and content creators, whose interests should be considered when deciding how copyright law tackles these technologies. Still others are concerned that legal intervention at this stage would lead to market concentration and “make our creative world even more homogenous and sanitized”. Is the use of training data that includes copyrighted works a fair use, or does it infringe on a copyright owner’s exclusive rights in her work? The generative AI models deployed by companies like OpenAI and Stability AI are trained on massive data sets. Moreover, because of the size of those data sets and the way they are collected (often by scraping websites), the companies that deploy these models do not make clear which works make up the training data. This question is controversial and highly debated in the context of written works, images, and songs.
There is already significant scholarship in the EU that takes a critical view of these exceptions (see e.g. here). However, the emergence of generative AI and its clash with the copyright world, together with a favorable political landscape and timing, appear to have given the commercial TDM exception a second wind as a viable, scalable policy option for tackling generative AI.
The piece was generated by Midjourney, a generative AI image tool, following prompts from artist Jason Allen. A work created entirely by AI is not eligible for copyright protection, U.S. District Judge Beryl Howell said on Friday, affirming the Copyright Office’s rejection of an application filed by computer scientist Stephen Thaler on behalf of his DABUS system. Additionally, you can add another layer of protection by enabling GitHub Copilot’s optional duplication detection filter. If you do, Copilot’s suggestions won’t include exact or near matches of public code on GitHub.
Besides the programmed algorithm, generative AI models rely on an immense amount of data to create new content. Under the copyright law of most countries, the creator of a work is generally considered the copyright owner; when a work is generated by AI, however, authorship is far less clear. Such ambiguity can create problems in determining who has the right to exploit the work, and in enforcing copyright violations.
Both class-action lawsuits were filed by the Joseph Saveri Law Firm, which specializes in antitrust litigation. The firm is also representing the artists suing Stability AI, Midjourney, and DeviantArt for similar reasons. Last week, during a hearing in that case, US district court judge William Orrick indicated he might dismiss most of the suit, stating that, since these systems had been trained on “five billion compressed images,” the artists involved needed to “provide more facts” for their copyright infringement claims. If the model is trained on many millions of images and used to generate novel pictures, it’s extremely unlikely that this constitutes copyright infringement. The training data has been transformed in the process, and the output does not threaten the market for the original art.
These lawsuits claim that the use of artists’ or writers’ content, without permission, to train generative AI is an infringement of copyright. The U.S. Copyright Office issued a notice of inquiry (NOI) in the Federal Register on copyright and artificial intelligence (AI). The Office is undertaking a study of the copyright law and policy issues raised by generative AI and is assessing whether legislative or regulatory steps are warranted.
- Some creators and companies believe their content has been stolen by generative AI companies, and are now seeking to strip these companies of the protective shield of fair use in a series of pending lawsuits.
- While AI systems do not contain literal copies of the training data, they do sometimes manage to recreate works from the training data, complicating this legal analysis.
- Indeed, just this month, the Senate Subcommittee on Intellectual Property held its second hearing on AI and its implications for copyright law.
- For example, Wikipedia licenses the majority of its text to the public under two open-source license schemes.
In his motion, Thaler argued that this matter transcended quibbles between individual artists. Providing copyright protections to such artworks, he said, would inspire creativity, ultimately placing it in line with the intentions of copyright law. “In both our listening sessions and other outreach, the Office heard from artists and performers concerned about generative AI systems’ ability to mimic their voices, likenesses, or styles,” the Office wrote in its notice. “Although these personal attributes are not generally protected by copyright law, their copying may implicate varying state rights of publicity and unfair competition law, as well as have relevance to various international treaty obligations.”
The plaintiffs argue that the images produced by Stable Diffusion are derivative works of copyrighted images, thus infringing on the rights of the original image owners. Judge Orrick called it “implausible” that specific plaintiff works are involved due to the vast amount of training data. If the arguments from the defense hold, then there’s the matter of where those books came from. Several of the experts WIRED spoke to agree that one of the more compelling arguments against OpenAI centers on the secretive data sets the company allegedly used to train its models. The claim, appearing verbatim in both of the recent lawsuits, is that the Books2 data set, which the lawsuits estimate contains 294,000 books, must, by its very size, hold pirated material.
Various institutions and professions may well decide that using generative AI for certain tasks is “cheating.” An educational institution could adopt a policy that prohibits students from using AI to write class assignments (though the policy should make clear whether it applies to AI-based tools such as spell-checkers). At the same time, the institution could allow teachers to use AI to create lesson plans, or could design its courses so that generative AI is among the tools students are expected to use. Whether training on copyrighted works is fair use is the question at the heart of a class-action lawsuit against OpenAI from a group of authors, including comedian Sarah Silverman, who claim their work was used to train ChatGPT without permission. Stephen Thaler, CEO of neural network firm Imagination Engines, has been at the forefront of the effort to establish copyright protections for AI-generated content, according to The Hollywood Reporter. Generative AI has also raised questions in areas of information policy beyond the copyright questions discussed above. Fraudulent content and disinformation, the harm caused by deepfakes and soundalikes, defamation, and privacy violations are serious problems that ought to be addressed.