Not long ago I was talking about whether AIs would become the next artists, whether they would be able to write novels or poems… Today I present GPT-2, an AI capable of creating texts from a basic idea. They say it is so convincing, so human, that the organization that developed it, OpenAI, advised by its backers, among them Elon Musk (Tesla & Co), Peter Thiel (PayPal, now an investor) and Reid Hoffman (LinkedIn), has decided not to share its code in full with the developer community.
OpenAI is a non-profit organization dedicated to researching and developing AI systems that are safe and do not pose a threat to our freedoms or to society.
An AI system capable of generating text that could pass as human-written could become a machine for producing fake news, manipulating political campaigns or creating digital impostors.
Reading-comprehension tests have shown that the system answers correctly 93.3% of the time on a benchmark built from children's books. The more abundant the available information on a topic, the better the result; conversely, with "esoteric or highly technical types of content, the model can perform poorly".
The system is a prime example of so-called "unsupervised learning": a mechanism capable of absorbing huge amounts of information without human curation, and one of the pillars of the development of artificial intelligence.
Better Language Models and Their Implications
We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization — all without task-specific training.
Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper.
GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text. The diversity of the dataset causes this simple goal to contain naturally occurring demonstrations of many tasks across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X the parameters and trained on more than 10X the amount of data.
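To make that training objective more concrete, here is a minimal sketch (not from OpenAI's post) of scoring the "predict the next word" goal with the small, publicly released GPT-2 checkpoint, assuming the Hugging Face transformers library; the example sentence is my own.

```python
# Minimal sketch of GPT-2's objective: predict the next token given all the
# previous tokens. Uses the Hugging Face `transformers` wrapper around the
# publicly released small checkpoint ("gpt2"); this is not OpenAI's training code.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "Artificial intelligence will change the way we write"  # illustrative input
input_ids = tokenizer.encode(text, return_tensors="pt")

# Passing the same ids as labels makes the model score each position against the
# token that actually comes next (labels are shifted internally), which is exactly
# the next-word prediction objective described above.
with torch.no_grad():
    output = model(input_ids, labels=input_ids)

print(f"Average next-token cross-entropy: {output.loss.item():.3f}")
```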
GPT-2 displays a broad set of capabilities, including the ability to generate conditional synthetic text samples of unprecedented quality, where we prime the model with an input and have it generate a lengthy continuation. In addition, GPT-2 outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use these domain-specific training datasets. On language tasks like question answering, reading comprehension, summarization, and translation, GPT-2 begins to learn these tasks from the raw text, using no task-specific training data. While scores on these downstream tasks are far from state-of-the-art, they suggest that the tasks can benefit from unsupervised techniques, given sufficient (unlabeled) data and compute.
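As an illustration of the conditional generation described above, the following short sketch primes the released small GPT-2 model with a prompt and samples a continuation, again assuming the Hugging Face transformers library; the prompt echoes one of OpenAI's published samples, and the sampling parameters (top_k, temperature, max_length) are illustrative choices rather than OpenAI's exact settings.

```python
# Conditional text generation: prime the small GPT-2 with an input and let it
# generate a continuation. Prompt and sampling settings are illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "In a shocking finding, scientists discovered a herd of unicorns"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

output_ids = model.generate(
    input_ids,
    max_length=100,                          # prompt + continuation, in tokens
    do_sample=True,                          # sample instead of greedy decoding
    top_k=40,                                # keep only the 40 most likely tokens
    temperature=0.7,                         # soften the sampling distribution
    pad_token_id=tokenizer.eos_token_id,     # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```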