Rabbit Hole Research

On Data, feat. Shayne Longpre | TRACES Appendix 38

Cristian Cibils Bernardes Season 1 Episode 38

In this conversation, Cristian and Shayne discuss the foundational role of data in AI and the challenges associated with data provenance and curation. They explore the organization and sourcing of data sets, the complexities of filtering and balancing data, and the legal and ethical implications of data usage. 

They also touch on the importance of transparency, accountability, and independent evaluation in the development of AI models, highlighting the need for responsible data practices and the potential impact of AI on society. They go on to explore the protocols and challenges surrounding AI research and the need for infrastructure in the field.

The discussion delves into the concept of a safe harbor for good-faith research and the importance of distinguishing between good and bad researchers. It also touches on the changing landscape of the web and its impact on data access and consent.

The enforceability of consent mechanisms and the complexities of copyright in the digital age are also discussed.

Find me at cristian@ccb.life

PRE-ORDER TRACES: A PSY-FI NOVEL NOW (https://ccblife.gumroad.com/l/traces)
Also, who are you? Get a draft of TRACES if you fill out this form (https://forms.gle/rFnVFrCNUAJz7Fvn7)

About the Guest:
Shayne Longpre is a PhD Candidate at MIT, where he works on training language models and understanding their broader social challenges. In particular, he investigates their risks, access, and transparency, with an emphasis on training data. He leads the Data Provenance Initiative and co-organized the AI safe harbor open letter (co-signed by 350+ researchers and journalists), advocating for better independent research access to closed models. His work has been covered by the New York Times, the Washington Post, and VentureBeat.

Set-Up:
- Camera: https://amzn.to/3PZVscb (don't laugh)
- Microphone: https://amzn.to/46f3pB5
- Teleprompter Stand: https://amzn.to/3tgS98y
- Teleprompter App: https://amzn.to/46jdH31
- Teleprompter Screen: https://amzn.to/3PNfKFI (yup)
- Headphones: https://amzn.to/46gMSwo