Time Travel: 2014

Chapter 82: Algorithms a Generation and a Half Ahead

And what was Lin Hui, caught in this whirlpool of public opinion, doing at the moment?

Working tirelessly, of course, consulting those reference materials that spanned time and space.

There was so much valuable information here.

Many things that were unremarkable in later eras could dominate this current time and space at will.

But Lin Hui did not let himself be carried away by excitement.

Because Lin Hui always kept in mind that half a step ahead makes a pioneer, while a full step ahead makes a martyr.

Technology a year and a half ahead could safely be brought out, but something suddenly four or five years ahead of everyone else would be a big problem.

Only technologies that fit the backdrop of 2014 were the valuable finds Lin Hui was looking for.

Searching through the ThinkPad from his previous life took almost no time.

Lin Hui found his prey:

a generative/extractive hybrid news summarization algorithm.

This algorithm was nothing particularly novel in his previous life.

Lin Hui liked it precisely because the algorithm was mature.

Maturity, to a certain extent, meant stability and reliable performance.

Using this algorithm directly, without any additional training, Lin Hui could develop the news summarization software he had previously conceived.

Of course, although this algorithm was nothing new in the time and space of his previous life,

in 2014 it still counted as leading technology.

Was it useful to be only a little ahead?

Never mind a little ahead; even half a step ahead could still drive competitors to despair!

Before you break my monopoly, I gouge you with exorbitant prices.

Once you do break my monopoly, I simply follow along and sell at bargain prices.

Infuriating, isn't it?

Whether others were angry or not, who knows.

In any case, the rabbit was so angry it wanted to bite someone.

What's more, things like algorithms inherently iterate fast!

Being a year and a half ahead was practically equivalent to being a full generation ahead in technology.

The generative/extractive hybrid news summarization algorithm was a generation and a half ahead of the mainstream news summarization algorithms of 2014.

This was no exaggeration on Lin Hui's part.

In fact, the mainstream method of automatic news summarization in this era was extractive summarization.

As the name suggests, the extractive method picks out, according to certain weights, the sentence or sentences from the original news text that come closest to its central idea.

Extractive summarization still relied on the old TextRank ranking algorithm.

The general idea of the algorithm: first, remove the stop words from the article.

Then measure the similarity between sentences, computing each sentence's similarity score relative to every other sentence.

Propagate the scores iteratively until the error falls below 0.0001.

Finally, sort the highest-scoring sentences back into their original order to obtain the desired summary.
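The TextRank procedure just described (strip stop words, score pairwise similarity, iterate until the change falls below 0.0001, then pick the top sentences) can be sketched in a few dozen lines. The tiny stop-word list and the `len + 1` normalization inside the logs below are simplifying assumptions for illustration, not the production algorithm.

```python
import math
import re

def textrank_summary(text, top_n=1, d=0.85, tol=1e-4):
    # Step 1: split into sentences and drop a small set of stop words.
    stop = {"the", "a", "an", "of", "to", "and", "in", "is", "it"}
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = [[w for w in s.lower().split() if w not in stop] for s in sentences]

    # Step 2: pairwise similarity = word overlap, normalized by sentence
    # lengths (len + 1 inside the logs avoids a zero denominator).
    def sim(a, b):
        overlap = len(set(a) & set(b))
        denom = math.log(len(a) + 1) + math.log(len(b) + 1)
        return overlap / denom if denom > 0 else 0.0

    n = len(sentences)
    w = [[sim(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out_weight = [sum(row) for row in w]

    # Step 3: propagate scores iteratively (PageRank-style) until the
    # largest change drops below tol, the 0.0001 threshold from the text.
    scores = [1.0] * n
    while True:
        new = [(1 - d) + d * sum(w[j][i] / out_weight[j] * scores[j]
                                 for j in range(n) if out_weight[j] > 0)
               for i in range(n)]
        delta = max(abs(x - y) for x, y in zip(new, scores))
        scores = new
        if delta < tol:
            break

    # Step 4: take the top-ranked sentences, restored to original order.
    top = sorted(sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_n])
    return [sentences[i] for i in top]
```

Note that the summary is assembled purely from sentences already present in the source, which is exactly the limitation the chapter goes on to discuss.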

Objectively speaking, this algorithm was not bad.

But the problem was that extractive summarization mainly considers word frequency and takes little account of semantic information.

Because of this, extractive summaries struggle to capture the core content of complex news.

And this summarization method had one glaring drawback:

extractive summarization works reasonably well on English news, but faced with Chinese news, it is completely out of its depth.

All in all, extractive summarization was relatively mature for its time.

However, its extraction quality and the fluency of its output left something to be desired.

It was precisely because of these various shortcomings of extractive summarization that generative summarization algorithms later appeared.

Generative summarization algorithms benefited from deep research into neural networks.

This approach generates summaries in a more human-like manner, which requires the generative model to have stronger capabilities for representing, understanding, and generating text.

In the generative approach, the computer reads the original text, understands the meaning of the whole article, and then produces a fluent summary in its own words.

Generative news summarization relies mainly on deep neural network architectures.

In understanding news content, generative summarization holds an inherent advantage over extractive summarization.

But this kind of summarization was not without drawbacks.

The method is easily constrained by the length of the original text.

Put a long news article in front of a generative summarization algorithm, and its most likely reaction is: (⊙﹏⊙) too long, didn't read!

The generative/extractive hybrid news summarization algorithm combined the advantages of both approaches.

For longer news, the algorithm could first extract the core content, then generate a summary from that core.
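The extract-then-generate pipeline could be sketched roughly as follows. The word-frequency scorer, the word budget, and the pass-through `generate_summary` stub are all placeholder assumptions; a real system would call a trained neural model in the second stage rather than return its input.

```python
import re
from collections import Counter

def extract_core(text, top_n=3):
    """Extractive stage: keep the sentences with the highest average word frequency."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in s.lower().split())

    def score(s):
        words = s.lower().split()
        return sum(freq[w] for w in words) / len(words)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:top_n]
    # Reassemble the selected sentences in their original order.
    return ". ".join(sentences[i] for i in sorted(ranked)) + "."

def generate_summary(text):
    """Generative stage stub: a real system would run a trained
    sequence-to-sequence model here to rewrite the text fluently."""
    return text  # placeholder

def hybrid_summarize(text, max_words=50, top_n=3):
    # Only run the extractive pass when the article exceeds the generative
    # model's input budget; short articles go straight to generation.
    if len(text.split()) > max_words:
        text = extract_core(text, top_n)
    return generate_summary(text)
```

The design point is the conditional: the extractive stage exists only to shield the length-sensitive generative stage from long inputs, which is exactly the division of labor the chapter describes.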

All in all, a piece of software developed on the basis of the generative/extractive hybrid news summarization algorithm would be entirely capable of defeating the software developed by Nick D'Aloisio.

After all, the software Nick had developed, whether Summly or Yahoo News Digest, was all based on extractive algorithms.

In effectiveness, the generative/extractive hybrid news summarization algorithm could fairly be said to beat extractive summarization hands down.

But then again, using such a formidable algorithm only to develop one piece of software and sell it off seemed like a bit of a waste.

After all, this was technology ahead of its time.

It seemed he could also publish a paper or two with it.

Well, publishing papers fresh out of high school might be a bit too shocking.

How could he squeeze the most value out of all of it?

