I think this year and the next will be a time for big breakthroughs in machine-generated content. Each day more and more websites are added to the net that live and die according to how many “views” they receive. The key to getting more views is to have more and more content. The problem is that, up till now, you needed a writer to create these stories. While a good writer can write several stories a day, their output is still limited. Writers also cost money. They are not well paid, to be sure, but the cost is still not zero. Therefore, websites are now starting to look at having a computer program write articles automatically.
This article from Kurzweil AI describes how several news services, such as Narrative Science, produce automated news stories almost constantly. Here are some good quotes:
The articles run on the websites of respected publishers like Forbes, as well as other Internet media powers (many of which are keeping their identities private).
For Narrative Science’s CTO and cofounder, Kristian Hammond, these stories are only the first step toward what will eventually become a news universe dominated by computer-generated stories. How dominant? “More than 90 percent,” says Hammond.
Narrative Science’s writing engine requires several steps. First, it must amass high-quality data. Then the algorithms must fit that data into some broader understanding of the subject matter. (For instance, they must know that the team with the highest number of “runs” is declared the winner of a baseball game.) So Narrative Science’s engineers program a set of rules that govern each subject, be it corporate earnings or a sporting event.
But how to turn that analysis into prose? The company has hired a team of “meta-writers,” trained journalists who have built a set of templates. Then comes the structure.
Hammond believes that as Narrative Science grows, its stories will go higher up the journalism food chain — from commodity news to explanatory journalism and, ultimately, detailed long-form articles. Maybe at some point, humans and algorithms will collaborate, with each partner playing to its strength.
Last week I wrote From Robo-Reporters to Robo-Teachers? and Video About Automatic Content Creators. As I keep reading more about this, I can start to see the shape of how this is going to work. Let’s look at the conditions that need to exist for machine-generated content to flourish.
- You need a lot of high-quality data. For example, with a dataset of all the pitches in a baseball game, a computer program can create a decent article summarizing the highlights.
- The result will be more fact-based stories. In fact, the weakness of the system is likely to be analysis and opinion.
- Right now the same stories are presented to all users, but it should be possible to rewrite them based on user preferences. Imagine you attend a little league game and your child is playing. The story writer could make your child the focus of the article. Is there a market for that? Maybe so.
- There is still a need for humans to teach the computer the context of the material. Even so, you can look at this as a single human producing hundreds or thousands of articles. This is the empowerment of creative individuals I have been noting.
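To make these conditions concrete, here is a minimal sketch in Python of how structured data, a human-written rule, and a sentence template might combine into a short game recap. This is my own toy illustration, not Narrative Science's actual engine: the `Play` record, the function names, and the templates are all made up, and it assumes a game with at least two teams and no tie.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Play:
    """One scoring record from a game (hypothetical data format)."""
    inning: int
    team: str
    player: str
    runs: int  # runs scored on this play


def final_score(plays):
    """Add up the runs for each team."""
    totals = {}
    for play in plays:
        totals[play.team] = totals.get(play.team, 0) + play.runs
    return totals


def write_recap(plays, focus_player: Optional[str] = None) -> str:
    """Fill simple sentence templates from the play data.

    If focus_player is given, the second sentence is about that
    player instead of the biggest play of the game.
    """
    totals = final_score(plays)
    # Domain rule supplied by a human: the team with the most runs wins.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    (winner, win_runs), (loser, lose_runs) = ranked[0], ranked[1]
    sentences = [f"{winner} beat {loser} {win_runs}-{lose_runs}."]

    scoring = [p for p in plays if p.runs > 0]
    if focus_player is not None:
        # Personalized version: highlight one player's contribution.
        own = [p for p in scoring if p.player == focus_player]
        if own:
            runs = sum(p.runs for p in own)
            sentences.append(
                f"{focus_player} accounted for {runs} run(s) for {own[0].team}."
            )
    elif scoring:
        # Generic version: highlight the single highest-scoring play.
        best = max(scoring, key=lambda p: p.runs)
        sentences.append(
            f"The biggest play came in inning {best.inning}, when "
            f"{best.player} accounted for {best.runs} run(s) for {best.team}."
        )
    return " ".join(sentences)


# Made-up sample data for a little league game.
plays = [
    Play(1, "Tigers", "Alex", 1),
    Play(3, "Bears", "Sam", 2),
    Play(5, "Tigers", "Jordan", 3),
]
print(write_recap(plays))
print(write_recap(plays, focus_player="Alex"))
```

The human contribution here is the rule about who wins and the wording of the templates. Once those exist, the same few lines can turn out a recap for every game that has data, and a version focused on any one player.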
Can we apply this to education? I think the answer is obviously yes. Classes in school are narrowly focused and fact-based. Textbooks are even more so. How long until we see machine-generated textbooks, personalized for each student based on their proven learning styles? Machine generation would also make it possible to create the vast amount of content that adaptive learning systems need. Add to this the big push to capture more and more data about students, which would supply the kind of data these systems depend on.
Our society is becoming more and more automated. Previously, automation was limited to purely mechanical tasks, but now computers are starting to encroach on more creative and intellectual fields. This will change society even more. Can schools change fast enough to keep up? Do they even realize what is happening? Time will tell (quickly).