OpenAI's GPT-4 Trained Using YouTube Videos: Details
OpenAI reportedly used more than a million hours of YouTube videos to train its GPT-4 language model.
OpenAI's unveiling of GPT-4, touted as its most capable large language model to date, marked a significant leap in AI capabilities. The model's proficiency was showcased through impressive scores on various exams, indicating its advanced linguistic understanding. However, a recent report has shed light on OpenAI's unconventional training approach.
A report by The New York Times disclosed that OpenAI ran short of training data and developed its Whisper audio transcription model to address the challenge. The company reportedly used Whisper to transcribe over a million hours of YouTube videos, and the resulting text was used to train the GPT-4 language model, a move that sits in a legal gray area.
OpenAI President Greg Brockman reportedly spearheaded the sourcing of these videos, signalling the company's innovative methods in dataset acquisition. With conventional data sources depleted by 2021, discussions ensued about leveraging YouTube videos, podcasts, and audiobooks for training purposes.
In response to inquiries, OpenAI's spokesperson, Lindsay Held, emphasized the company's commitment to curating diverse datasets for each model to enhance its comprehension and competitiveness. Held also said the company draws on public data and partnerships and is exploring synthetic data generation.
To recall, OpenAI's blog post introducing GPT-4 read, "We've created GPT-4, the latest milestone in OpenAI's effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks."
While speculation about GPT-5 has circulated, OpenAI has not officially confirmed a launch timeline. Nonetheless, CEO Sam Altman has expressed ambitions for even more powerful language models in the future, underscoring OpenAI's continued pursuit of AI advancement.
© 2024 Hyderabad Media House Limited/The Hans India. All rights reserved.