AI Insights Archives — N2 Communications

Introducing N2+AI

Posted on June 1, 2024 - 1:11 am by Joe Flood

“I automate whatever can be automated to be freer to focus on those aspects of music that can’t be automated. The challenge is to figure out which is which.” –Laurie Spiegel, coder, composer, badass.

AI images: pretty fun for ideating, not a self-driving brand creation tool. Image generated by DALL-E via Poe. (See footnotes for the full prompt.)

I started N2 a decade ago because the internet was changing the usual way of doing things. Today, we’re launching N2+AI in a similar moment.

Rewinding to the mid-2010s, traditional media models were collapsing and the long-form, impactful journalism jobs that I and so many of my peers aspired to were vanishing. At the same time, I’d never been so busy with corporate clients. These companies—largely in tech and finance—needed precisely the kinds of talented folks who were getting laid off from newspapers and magazines, or grossly underpaid as teachers and researchers. The tricky piece was bringing the sides together in a way that worked for both. In 2015, I started N2 Communications to solve that problem. We’ve been evolving and refining our model ever since.

Today, the AI revolution is fast remaking how we create, consume, and even conceive of content. Our new venture—N2+AI—builds on what my team and I have learned over the past decade and tailors it for what’s next. We are extremely proud to announce that we’re partnering with Das Rush to spearhead our AI team. For the past 10 years, Das has led content initiatives for some of Silicon Valley’s most influential investors, tech companies, and startups, most recently as head of content for a16z Growth, the late-stage fund at Andreessen Horowitz.

Das has been a friend since college, and we’ve been talking about AI over desert campfires and over-caffeinated Zoom calls for years. Gradually those conversations have turned into hypotheses, syntheses, and now into a full-blown set of services and a crack team of AI implementers. We’re here to help innovators bring their software to investors and the market, and to help companies and creators use those tools efficiently, responsibly, and creatively. With years of experience in writing, tech, and AI, I think Das is the best person in the business to bring it all together.

Our AI thesis is simple: your content is now your data, and to have the best and most trustworthy data, you need to invest in content experts (read: writers, researchers, artists, etc.) along with AI engineers. Sometimes that means creating great content so that your language models have the best data to train on—otherwise it’s garbage in, garbage out. Other times that means developing smart, responsible AI policies and principles and communicating them to your customers, clients, employees, and investors. Sometimes it means using the best editors, producers, and fact-checkers to conceptualize and clean up AI-created content; and sometimes it means knowing when the right human is more artful and efficient than the algorithms.

We are now seeking beta customers for a set of services we have developed to run AI content initiatives end to end, both for startups building today’s AI and for the enterprises implementing it. These services span the ideation, development, and ongoing improvement of generative AI and LLM strategies.

1. Ideation – map your AI journey

We begin all AI engagements with a consultation to develop a plan to:

Build strategies and messaging that stand out and educate the market
Identify AI use cases, both internal and external
Set editorial policies for safe and effective models and applications that users can trust
Capture your expertise in proprietary content that will differentiate your AI
Use AI for creating and updating high-quality content, like pitch decks, videos, podcasts, and thought leadership

2. Development – turn your content into data

Once the strategy and policies are in place, we develop the content and then curate it into datasets for:

Low-code/no-code chatbots
Retrieval augmented generation (RAG)
Finetuning
Language model training

3. Ongoing Improvement – test and iterate your AI

We have the skilled editors, content moderators, and language experts to:

Red-team your AI to improve it before release
Monitor your AI for adherence to your editorial guidelines
Curate prompts and responses into new datasets to keep improving your AI

Das and our AI team have produced a series laying out our views on AI, starting with the role creatives play in developing best-in-class AI and a primer on LLMs. We’re excited to expand on this series as the space evolves.

We approach this moment with curiosity, humility, and our eyes open to what we don’t know. We also approach this moment with the belief that N2’s tested model and vast talent pool position us to serve our clients well, whatever may lie ahead.

To hear more about N2+AI, please reach out to ai@n2comms.com, and be sure to watch this space as we share more from Das and our team.

Onwards,
Joe Flood
N2 founder and CEO

Edited by Chris Edmonds, N2 co-founder and COO

Feature image prompt (with help from GPT-4o): Create an image that captures the transition from traditional media to AI-powered content creation. The image should depict a timeline or evolution, starting with a traditional newsroom setting on one side and transforming into a futuristic AI-driven content hub on the other. Include elements like journalists with notebooks and cameras on the left, and on the right, advanced AI interfaces, digital screens, and data streams.

Like this post? Check out other pieces in our content series:

Interested in working with us? We are currently looking for beta customers for our AI content services. Reach out to ai@n2comms.com.

Writers in the Loop

Posted on June 2, 2024 - 6:35 pm by Joe Flood

“A company asked why it was so hard to hire a good writer. I told them it was because good writing is an illusion: what people call good writing is actually good thinking, and of course good thinkers are rare.” –Paul Graham, Y Combinator Founder, on X

Image generated by DALL-E 3 via Poe. (See footnotes for full prompt.)

Dark headlines have conditioned us to believe that AI is going to automate creatives out of existence.

As working creatives ourselves, we share these worries. We know that generative AI has the potential to upend the livelihoods of folks like us: writers, editors, and researchers, but also poets, painters, actors, and so on.

But we also believe that the fears that AI will wholesale replace the creative community grossly underestimate the value of human creativity in AI. The reality is that AI needs creatives of all stripes in the loop to turn technological capabilities into real-world value.

This view isn’t just wishful thinking. It’s what we can learn from the long, intertwined history of technology and business. For all the buzz and utopian vs. doomsday headlines, AI is just a really powerful data tool. The difference with this latest data revolution is that the data isn’t the traditional “structured data” of columns of numbers; it’s the “unstructured data” of content: stories, videos, illustrations, music, dance, art. Until recently, this content was deemed too low-value to be worth the cost to capture, store, and use as data. Generative AI and LLMs have changed that equation.

Whenever there is a big leap in data technology, the countries and companies that wield the best data generally win. Renaissance Italian merchant cities reintroduced double-entry bookkeeping to Europe and dominated the continent’s banking and trade for centuries. Efficiency obsessives like Andrew Carnegie and John D. Rockefeller built their Gilded Age empires more with adding machines, typewriters, and telegraphs than with ruthless Robber Baron tactics. Harvard Business School used standardized testing to find the “Whiz Kids” GIs who helped the Allies win World War II, kick-started the Computer Age, and turned America into an economic superpower in the process.

By the end of 2025, investment in AI is expected to approach $200 billion, and most Fortune 500 companies plan to increase AI budgets 2-5x. Less clear is where and how to do that safely and effectively. Just as frequently as we read about new AI startups and breakthroughs, cautionary headlines warn of LLMs prompted to reveal sensitive data, chatbots hallucinating generous refund policies that companies actually have to pay for, and model responses that reinforce discriminatory biases.

Quality content and skilled creators are critical for turning technological promise into real use cases and economic value—and to avoid the pitfalls and lawsuits.

The reason is simple: a language model is only as good as the data it is trained on, and better content is better data. To illustrate, about a year ago, I sat in on a private demo of a company’s new AI Agent.¹ Users could ask a question, and the agent would respond in natural language followed by a list of relevant links and citations from the company’s proprietary content stored in a local database.

A single engineer had worked part-time on the project² and had it up and running in a few months. The real value of the company’s agent didn’t come from the replicable technology; it came from the years of proprietary content. The technology gave users a better way to access that content, but the content itself was what drove value that no competitor could copy.

More recently, LinkedIn founder and venture capitalist Reid Hoffman created an impressive AI twin of himself. The technology used includes synthetic audio from Eleven Labs and a video avatar by Hour One, but the real magic is in the content that trained the AI. As Reid says in the video description: “[the] persona—the way that REID AI formulates responses—is generated from a custom chatbot built on GPT-4 that was trained on my books, speeches, podcasts and other content that I’ve produced over the last few decades.”

We might think of Reid Hoffman as a technologist, but he’s also a very good writer and storyteller who has spent thousands of hours and probably (many) thousands of dollars on editors, podcast producers, and other creatives to hone his skills and clean up his content. It’s this content—the vast troves of unstructured training data—that drove the quality of Hoffman’s AI twin.

One of the most common—and dangerous—misconceptions about AI is that it is about to replace human writers, the content creation version of fully autonomous driving. Almost every week we hear an executive saying that “AI can write all our content.” AI can certainly generate huge amounts of content quickly and cheaply, but the highest quality content still needs writers in the loop for three reasons.

1) Human writers teach AI “what good looks like” for different use cases.

Real Reid Hoffman can’t just give REID AI a destination (“Write me a great speech!”) and let it navigate on its own. He needs to give it a lot of very specific guideposts. That’s the thinking part of writing from the Paul Graham quote I opened this piece with. Even if AI is part of creating content, you need humans to give context and tell the AI what voice to mimic, who the audience is, what is and isn’t an authoritative source, whether the speech should be funny or serious or both, and so on.

Screenshot from Poe, a platform where anyone can build an AI bot by describing what they want it to do – no coding required. Of course, the bot is only as good as the instructions it’s given.

2) Human writers help avoid model collapse and keep improving AI performance.

When models are trained mostly on AI-generated content, without regular infusions of new human-created content, their output goes from average to awful pretty quickly, a phenomenon known as model collapse. It’s the AI equivalent of a copy of a copy of a copy, each instance a slightly lower resolution than the previous one, until the audience has no idea what they’re looking at.
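For the curious, here is a minimal sketch of that copy-of-a-copy effect. It is a toy analogy, not a language model: each "generation" fits a simple bell curve to the previous generation's output and then produces new data only from that fit. The function name, sample size, and number of generations are all assumptions chosen for illustration.

```python
import random
import statistics

def simulate_generations(generations=300, sample_size=20, seed=0):
    """Refit a bell curve to its own samples, generation after generation."""
    rng = random.Random(seed)
    # Generation 0 stands in for human-made data: mean 0, spread 1.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    spreads = []
    for _ in range(generations):
        mean = statistics.fmean(data)
        spread = statistics.pstdev(data)
        spreads.append(spread)
        # The next "model" only ever sees output from the previous one.
        data = [rng.gauss(mean, spread) for _ in range(sample_size)]
    return spreads

spreads = simulate_generations()
print(f"spread at generation 1: {spreads[0]:.3f}")
print(f"spread at generation {len(spreads)}: {spreads[-1]:.6f}")
```

In this toy setup the spread of the data tends to shrink with each pass, so the variety in the original data gradually disappears. Real model collapse is more complicated, but the intuition is similar.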

To return to our REID AI example, actual Reid will need to constantly give it good new content and smart new thoughts to train on. Otherwise REID AI will quickly turn into a poor approximation, a behind-the-times Hoffman—and poor and behind-the-times are the last things you want from a technology investor.

3) Human writers make it possible to copyright AI-generated content.

While AI laws and precedents are still being set, in the United States content generated by an AI without a human hand transforming it currently cannot be copyrighted, a topic we’ll cover in more detail in a future post.

Some companies already possess quality, copyrightable content to build solid AIs with. Take Adobe. A year ago, investors were questioning whether AI posed an existential risk to the company. Now, Adobe has launched its own AI tools based on its existing library of hundreds of millions of well-tagged and organized stock photos. Not every company has such a treasure trove of content, but they do have internal communications, technical documentation, community forums, Slack threads, or thought leader executives (get ready for the folksy Warren Buffett Annual Letter-bot!). Still others may have their most valuable information locked in the minds of internal experts who just need a skilled interviewer and writer to bring it to the fore. And for companies that have none of the above, they’re going to need to roll up their sleeves, do some hard thinking and good writing, and start building ideas and stories that are worth training an AI on.

But where you’re starting from is far less important than the fact that you’re starting at all. Businesses that invest in quality content will have an enduring data edge, no matter how the technology, market, laws, and protocols evolve. Those who make the writers and editors and artists who create this content a core part of their AI strategy will also have a talent edge. And in a race as frenetic and relentless as the current AI boom, every advantage counts.

Footnotes:

1. AI Agents can perform specific tasks without human intervention. In their current form, they can be thought of as super powerful chatbots, though advances in robotics mean AI Agents may take on more embodied forms in the future.
2. This particular AI Agent was set up with a RAG architecture, so the bot’s language skills were powered by ChatGPT, but the domain expertise and results were based on a set of content materials stored in a vector database, where they could be easily updated.

Written by Das Rush, N2 AI Strategist
Edited by Joe Flood, N2 founder and CEO

Feature image prompt: Create an image of “Writers in the Loop” that visually represents a harmonious collaboration between human writers and AI. Emphasize the concept that content is now data. Show data streams or binary code integrating with written text, symbolizing how human-generated content fuels AI models.

LLMs 101

Posted on May 2, 2024 - 6:34 pm by Conrad Julian

“AI is the science and engineering of making intelligent machines.” –John McCarthy, the Stanford computer science professor who coined the term “artificial intelligence” in 1956

Image generated by DALL-E 3 via Poe. (See footnotes for full prompt.)

In November 2022, engineers at OpenAI were agonizing over a decision to release a new chatbot.

They hoped the release would create a flywheel: more people would use their AI, which would mean more user feedback, which would mean more rapid improvement to the model, which would attract even more users. At the time, however, the company’s executives were worried the chatbot wasn’t good enough for public release. The AI in question was ChatGPT.

Within two months of launch, it was the most viral technology in history, with 30 million users and a reported $10 billion investment from Microsoft. And depending on your vantage, it felt like either an overnight revolution or a decades-long evolution.

To much of the public, the tool’s capabilities were so impressive they seemed almost like sorcery. But for AI researchers, ChatGPT was just the latest iteration of a research field known as natural language processing, or NLP. If AI is, broadly, the science of making intelligent machines, then NLP is the discipline that tries to teach the intelligent use of language to those machines. Since the field of AI emerged around 1950,¹ NLP has been a mostly underappreciated and underfunded field of inquiry, with only a handful of researchers in academic and big tech research labs who believed we could teach computers to read and write at human levels.² NLP is underfunded no more.

Like all AI, a language model has three components: a model, compute to power the model, and the data the model learns from. Technological advances around each of these three components—GPUs (compute), transformer architecture (models), and the internet (data)—have come together to make ChatGPT and the current wave of LLMs and AI applications possible.

In this primer, we’ll walk through each of these breakthroughs as we explain how language models work. More importantly, we’ll explain why quality data—specifically human-generated content like articles, podcasts, videos, books, and images—will be the difference between success and failure in an AI-enabled world.

The Compute Breakthrough: Graphics Processing Units

Compute is processing power.³ The more processing power, the bigger the model can be and the more data it can process. Whether in the cloud or on a local device, compute is essentially a cluster of microchips that performs the basic computing processes that make everything else work.

The breakthrough to more powerful chips, and therefore greater compute, traces back to the video game boom of the 1990s.⁴ At the time, mainstream chips were serial processors, built to execute one task at a time. These serial processors steadily improved (my 10-year-old mind was blown when I upgraded from the original 8-bit Nintendo to a 16-bit Sega). But to get from Street Fighter to The Last of Us, developers needed a chip capable of splitting up and running multiple, complex processing tasks at the same time. A startup called Nvidia⁵ met that need by designing a new type of chip: the GPU.

These GPU chips made it possible to process a lot more data, a lot more efficiently. And as interest and investment in language models have exploded, so has demand for GPUs. That demand has led to a global chip shortage and made compute perhaps the biggest cost and bottleneck to advancing AI models and adoption. (The dearth of GPUs even drove some desperate engineers to repurpose old video game chips as a substitute.)

But that shortage is easing as more public and private funding goes into chip manufacturing. Nvidia is ramping up production and, in March 2024 at its GTC conference, announced a new generation of AI chips that it says can run LLM inference up to 30 times faster than the current H100s.

With more compute, the next bottleneck to AI is likely to be high-quality training data. However, let’s not get ahead of ourselves. First, we need to talk about models.

The Model Breakthrough: Transformer Architecture

Crack open any AI model and, at its core, it’s statistics and probability. Models learn statistical patterns from a given dataset and then apply those lessons to any new data they encounter. Doing that with any kind of efficiency, though, required an innovation: a deep learning architecture known as the transformer.

The transformer architectures⁶ that are the basis of LLMs like GPT, Claude, and Gemini were first described in the influential 2017 paper “Attention Is All You Need.” The paper proposed an architecture that could take advantage of GPUs’ ability to process information in parallel and effectively predict the next word. With this architecture and sufficient compute, a language model could train on huge amounts of text, identify relationships and patterns, and then use those relationships and patterns to predict the next word and generate responses to prompts.

How exactly LLMs do this is incredibly complex. Fortunately, you don’t need to be deeply versed in things like model weights, parameters, backpropagation, and feed-forward layers to use AI effectively, much like you don’t need to understand the physics behind an internal combustion engine to drive a car.

It is, however, critical to understand three core concepts of what transformer-based language models do: 1) convert language into math, specifically vector math, 2) predict the next word, and 3) improve predictably with more data, more compute, and bigger models, a principle known as scaling laws.

1) Convert language into math

Language models convert words (or “tokens,” which are often words but can also be parts of words) into vectors⁷ through a process known as embedding; word2vec is an early, well-known example of the technique. A vector is a list of numbers that can be read as a point in space, with a size (or “magnitude,” in mathematical terms) and a direction. This essentially converts words, a language machines don’t understand, into math, a language that they do. This mathematical representation allows models to calculate relationships between vectors and use those calculations to predict the next word.
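To make the idea of calculating relationships between vectors concrete, here is a minimal sketch. The three-number vectors below are hand-picked assumptions for illustration, not output from any real model; real embeddings are learned during training and have hundreds or thousands of dimensions. The comparison used, cosine similarity, is one common way to measure how closely two vectors point in the same direction.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration.
# Real models learn vectors with hundreds or thousands of dimensions.
EMBEDDINGS = {
    "star": [0.9, 0.8, 0.1],
    "moon": [0.8, 0.9, 0.2],
    "invoice": [0.1, 0.2, 0.9],
}

def magnitude(vector):
    """Length of the vector: the square root of the sum of squares."""
    return math.sqrt(sum(x * x for x in vector))

def cosine_similarity(a, b):
    """Compare direction: close to 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (magnitude(a) * magnitude(b))

star_moon = cosine_similarity(EMBEDDINGS["star"], EMBEDDINGS["moon"])
star_invoice = cosine_similarity(EMBEDDINGS["star"], EMBEDDINGS["invoice"])
print(f"star vs moon: {star_moon:.2f}")
print(f"star vs invoice: {star_invoice:.2f}")
```

With these made-up numbers, "star" scores as much closer to "moon" than to "invoice," which is the kind of relationship a model can then use when predicting the next word.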

For a deeper explanation of how vectors and language models work, check out this fantastic Ars Technica explainer.

2) Predict the next word

Whenever the model is prompted, it responds by predicting the next word repeatedly. For example, if you train a model on a bunch of lullabies and then ask the AI to complete the line, “twinkle, twinkle, little ___,” it will analyze the patterns across the lullabies and most likely predict “star.”
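The lullaby example can be sketched in a few lines. To be clear, this is not how a transformer works internally: it is a much simpler word-pair counter, used here only to show the idea of predicting the next word from patterns in training text. The training lines and function names are assumptions made up for the illustration.

```python
from collections import Counter, defaultdict

# A tiny, made-up training set; real models train on billions of words.
LULLABIES = [
    "twinkle twinkle little star how i wonder what you are",
    "twinkle twinkle little star up above the world so high",
    "hush little baby do not say a word",
]

def train(lines):
    """Count which word follows each word in the training text."""
    following = defaultdict(Counter)
    for line in lines:
        words = line.split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the most frequently seen next word, or None if unseen."""
    candidates = following.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

model = train(LULLABIES)
prediction = predict_next(model, "little")
print(prediction)  # "star" follows "little" twice, "baby" only once
```

Because "star" follows "little" more often than "baby" does in this training text, the counter picks "star." Change the training text and the prediction changes with it.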

That prediction isn’t guaranteed because AI is not like traditional software. So, what’s the difference between AI and traditional software?

Traditional software is deterministic. You code an output for a given input, and the computer will provide that same output every time it sees that input. LLMs, however, are probabilistic, driven by statistics. The answers a model generates and the words it predicts can change, even if the prompt remains the same. Think of it as rolling 10 dice 10 times. The dice and the action of rolling are the same each time, but the combination of numbers almost always changes.

Like a fancier version of a bell curve, an LLM’s potential answers fall on a probability curve. The shape of that curve comes from the model’s weights, and it can be adjusted at generation time with sampling settings. For instance, a setting known as “temperature” controls how random the outputs are. The allowed range depends on the provider; OpenAI’s API, for example, accepts values from 0 to 2. At 0, the model almost always picks the most likely next word, making responses more deterministic and uniform, but also less creative and interesting. At 2, the model takes more risks picking the next word, resulting in more random outputs, more creativity, and more hallucinations.
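Here is a rough sketch of how temperature reshapes the probability curve. The scores for each candidate word are invented for illustration, and the function names are our own. The math shown (dividing the scores by the temperature before converting them to probabilities) is the standard textbook approach, though any given provider's implementation may differ in its details.

```python
import math
import random

# Hypothetical raw scores ("logits") for the next word; invented for illustration.
LOGITS = {"star": 4.0, "lamb": 2.0, "teapot": 0.5}

def probabilities(logits, temperature):
    """Turn scores into probabilities; lower temperature sharpens the curve."""
    if temperature <= 0:
        # Treat 0 as "always pick the top-scoring word".
        best = max(logits, key=logits.get)
        return {word: 1.0 if word == best else 0.0 for word in logits}
    scaled = {word: score / temperature for word, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {word: math.exp(value - top) for word, value in scaled.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

def sample(logits, temperature, rng):
    """Draw one next word according to the temperature-adjusted probabilities."""
    probs = probabilities(logits, temperature)
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=1)[0]

rng = random.Random(0)
for temperature in (0, 1.0, 2.0):
    probs = probabilities(LOGITS, temperature)
    picks = [sample(LOGITS, temperature, rng) for _ in range(5)]
    print(temperature, {w: round(p, 2) for w, p in probs.items()}, picks)
```

At temperature 0 the sketch always returns "star." As the temperature rises, "lamb" and "teapot" get a larger share of the probability, so they show up more often in the samples.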

3) Scaling Laws

Initially, transformer-based models didn’t seem all that impressive, except to the researchers who understood the technology. As I once heard Dario Amodei, cofounder and CEO of Anthropic and previously VP of Research at OpenAI, explain to a room full of AI builders:

When we put out GPT-2, some of the stuff that was considered most impressive at the time was, “Oh, my God. You give these five examples of English to French translation. Just offer it straight into the language model. Then you put in a sixth sentence in English and it actually translates into French. It actually understands the pattern.” That was crazy to us, even though the translation was terrible. It was almost worse than if you were to just take a dictionary and substitute word for word.

Our view is that this is the beginning of something amazing because there’s no limit and you can continue to scale it up. There’s no reason why the patterns we’ve seen before won’t continue to hold. The objective of predicting the next word is so rich and there’s so much you can push against that it just absolutely has to work. Some people looked at it and were like, “You made a bot that translates really badly.”

What Amodei and researchers saw in GPT-1 and GPT-2, and the rest of us discovered with ChatGPT, was the power of scaling laws. These laws essentially say that performance improves predictably as you scale up three things: the size of the model (more parameters), the compute used to train it, and…the data it learns from.
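Published scaling-law research generally describes this relationship as a power law: each tenfold increase in scale buys a smaller, but still predictable, improvement. The sketch below shows only the shape of such a curve for model size. Every constant in it is a made-up assumption for illustration and is not fitted to any real model.

```python
# Illustrative constants only; they are not fitted to any real model.
IRREDUCIBLE_LOSS = 1.7   # hypothetical floor that no amount of scale removes
SCALE_CONSTANT = 400.0   # hypothetical
EXPONENT = 0.3           # hypothetical

def predicted_loss(parameters):
    """Power-law form: loss falls as model size grows, with shrinking gains."""
    return IRREDUCIBLE_LOSS + SCALE_CONSTANT / (parameters ** EXPONENT)

sizes = [1e8, 1e9, 1e10, 1e11]
losses = [predicted_loss(n) for n in sizes]
for n, loss in zip(sizes, losses):
    print(f"{n:.0e} parameters -> predicted loss {loss:.3f}")
```

Here "loss" is a measure of prediction error, so lower is better. The curve keeps falling as the model grows, which is why researchers were confident that bigger models trained on more data would keep getting better.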

The Data Breakthrough: The Internet

Even with the most sophisticated deep learning architecture and the most powerful processing available, a language model is only as good as the data—the words—that it is trained on.

ChatGPT, for better or worse, was trained on the world’s largest dataset: the internet. To keep improving, especially as more models come to market, the biggest LLMs (OpenAI’s GPT, Anthropic’s Claude, Google’s Gemini) are ravenous for data. What’s more, if AI scaling laws hold, the demand for data, specifically “unstructured data,” will only increase.

Structured data is what we typically picture when someone says “data”—it’s the rows and columns in a spreadsheet. Unstructured data is data without a clear and consistent format. It’s Slack messages, Zoom recordings, emails, GDocs, and images. (There is also semi-structured data, but we won’t get into that here.) Before LLMs, unstructured data was too difficult and expensive for most companies to bother capturing and storing.

To put numbers to it: today, unstructured data accounts for 90+% of all the data that companies generate, but less than 2% of that is captured and stored. The amount they are generating (and capturing and storing) is rapidly increasing. But just because they have the data doesn’t mean they are willing to hand it over to train AI models.

This leaves LLMs with two big problems: 1) they are running out of data, and 2) the data they have is of very mixed quality.

LLMs need more data. Data is important enough that LLM builders will go to great lengths to get it, with or without explicit permission.⁸ But the publicly available data is running out, possibly as soon as 2026. How these LLMs will acquire the content they crave is the open question, and much of the answer will be determined in legislatures and courtrooms. (We’ll cover these in more detail in an upcoming post.)

LLMs need better data. As any human who has ever logged on to the internet knows, it’s a mixed bag that contains both the totality of humankind’s collective knowledge, and an equal or greater amount of junk—expired eBay listings, controversial subreddits, derelict MySpace pages, and more nefarious sources of misinformation and overt and covert bias and discrimination.

As LLM hallucinations⁹ frequently remind us, even the most powerful AI models are subject to the old rule: “garbage in, garbage out.” If a model is trained on low-quality data, it will be more likely to produce low-quality results. Put another way, if you train a model on the entire internet, you can weight the model to prefer Wikipedia over 4chan, but the trolling is still somewhere in the model’s probability curve.

A flood of talent and funding has led to new finetuning techniques and retrieval augmented generation (RAG) that make it possible to customize AI and improve performance and accuracy on a specific use case. It is more technical than we’ll get into here, but we like this explanation from Snorkel AI that compares finetuning to a doctor updating their expertise and RAG to checking a patient’s chart. Both techniques can be, and increasingly are, used together to optimize performance. These techniques use content in different ways, but both require content to teach the AI what good looks like.
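For readers who want a feel for the "checking the chart" half of that comparison, here is a bare-bones sketch of the retrieval step in RAG. It is a simplification under stated assumptions: the documents are made up, the function names are our own, and plain word counts stand in for the learned embeddings and vector database a real system would use. The final prompt would then be sent to a language model, a step this sketch leaves out.

```python
import math
from collections import Counter

# Hypothetical company documents; a real system would store embeddings
# from a trained model in a vector database rather than word counts.
DOCUMENTS = [
    "Refunds are available within 30 days of purchase with a receipt.",
    "Our support team answers email from 9am to 5pm on weekdays.",
    "Gift cards never expire and cannot be exchanged for cash.",
]

def vectorize(text):
    """Crude stand-in for an embedding: a bag-of-words count."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, documents):
    """Return the stored document most similar to the question."""
    query = vectorize(question)
    return max(documents, key=lambda doc: similarity(query, vectorize(doc)))

def build_prompt(question, documents):
    """Put the retrieved passage in front of the question for the language model."""
    context = retrieve(question, documents)
    return f"Answer using only this source:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("Are refunds available after purchase?", DOCUMENTS)
print(prompt)
```

The point of the design is that the answer is only as good as the passage retrieved. If the refund policy in the document store is wrong or out of date, the model will confidently repeat it.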

AI is intelligent, but it isn’t omniscient. If you can’t define good, you can never train the AI to get there. And defining good is really just telling the AI what good content—articles, videos, podcasts, and graphics—to focus on. What defines good content hasn’t changed: it’s content that knows its audience, backs up claims with authoritative sources, has a voice and style, uses the medium to support the message, and is regularly reviewed and updated. Who creates good content hasn’t changed either: it’s the best human creatives, not bots.

Footnotes:

1. Some people date the birth of AI as a field to the mid-1800s and the work of Ada Lovelace; others to 1950, when Alan Turing published his foundational paper “Computing Machinery and Intelligence”; still others to the 1956 Dartmouth summer conference when the term “artificial intelligence” was coined.
2. By “human levels” we mean that generally people view ChatGPT as about as good as the average human.
3. For AI, this processing power is often measured in floating-point operations per second, often shortened to FLOPS.
4. Interesting note: games are often where cutting edge infrastructure is built. First, it seems we build it for World of Warcraft, Elder Scrolls, and Zelda, then we sell it to the Fortune 500.
5. Nvidia launched its original GPU in 1999, the same year it went public at $12 per share.
6. For simplicity, we focus this primer on transformer architectures. There are a number of other types of AI models that are part of the current wave, including diffusion architectures that apply similar principles to image and video, rather than text.
7. For a better technical explanation of vectors: https://mathinsight.org/vector_introduction
8. While OpenAI does not disclose exactly what makes up its training dataset, it generally describes it as “all publicly available data on the internet.” In an investigative piece in April 2024, the New York Times reported that OpenAI built a speech recognition tool to help transcribe YouTube videos to serve as training data.
9. A hallucination is when a model presents invented data as fact.

Written by N2+AI’s Das Rush, Madelyn Goodman, and Tessa Stuart

Edited by N2 CEO Joe Flood and COO Chris Edmonds

Feature image prompt: Create an illustration that visually represents the three core components of a language model: the model (symbolized by a neural network or transformer architecture), compute power (symbolized by GPUs or a data center), and data (symbolized by an interconnected web, or icons representing articles, podcasts, videos, books, and images). The illustration should convey the synergy between these elements, highlighting their role in the development of modern NLP and AI applications. Use a futuristic and cohesive design to emphasize the cutting-edge nature of these technological advancements. Do not include words in the image.

Risky Business

Posted on February 28, 2024 - 12:07 pm by Conrad Julian