You have a talent for finding clippings. I see this is from Trove. Do you have any particular search techniques you would like to share?
Well, you can use the search right on their website, but you can also use Google from anywhere else as long as you specify the right parameters. This is a good prototype for pasting into a Google search box …
site:trove.nla.gov.au "antarctica"
NOTE-1: The query is two terms.
NOTE-2: No spaces in the first part.
NOTE-3: In the second part, place whatever you want to search for between the quote marks. Spaces are allowed there, of course. BUT …
Newspapers are scanned, then they are OCR’d into actual text.
Sometimes the OCR process sucks. Then humans need to go through the text and correct it by eyeballing the OCR output against the original scanned newspaper and fixing the errors manually. NOTE: People can really help correct these things, and Trove makes this pretty simple. Open an account there and, as you see these stories from Steve, visit the article at Trove; if you see OCR errors, take some time to correct them. If errors are never corrected, then obviously plain-text searches will not find those instances.
Another problem is that newspapers traditionally use a style with many words broken across lines by inserted hyphens (this was the common technique for creating justified text, filling a column with an equal number of characters per line, before the modern era of automatically re-flowing text that uses proportional fonts and especially variable spacing to avoid physical word breaks). It seems to be the rule at Trove, quite understandably, to preserve the exact wording, breaks and all, in the OCR-converted text. Consequently there are keywords that will be broken, like “Amundsen” becoming “Amund-” CRLF “sen” or “Antarctica” becoming “Antarc-” CRLF “tica”.
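For anyone who has copied article text off Trove into a local file, here is a minimal Python sketch of how those broken keywords could still be found. It assumes the saved text keeps the hyphen and line break exactly as described above; the function names and the sample sentence are made up for illustration and are not anything Trove or Google provides.

```python
import re

def broken_word_pattern(word):
    """Build a regex that matches `word` whole, or split by a hyphen plus line break."""
    # Between each pair of letters, allow an optional "-" followed by CRLF or LF.
    gap = r"(?:-\r?\n)?"
    return re.compile(gap.join(re.escape(ch) for ch in word), re.IGNORECASE)

def rejoin_broken_words(text):
    """Remove every hyphen that sits directly before a line break, joining the halves."""
    return re.sub(r"-\r?\n", "", text)

# Hypothetical OCR text with two keywords broken across lines.
sample = "Captain Amund-\r\nsen sailed for the Antarc-\r\ntica coast."

print("Amundsen" in sample)                                 # plain search misses it
print(bool(broken_word_pattern("Amundsen").search(sample)))  # pattern search finds it
print(rejoin_broken_words(sample))
```

One caveat on rejoin_broken_words: it also merges genuine compounds that happened to break at the hyphen, so “south-” CRLF “west” would come out as “southwest”. Google's search box offers no pattern matching of this kind, so a keyword stored in broken form can still be missed by a quoted search.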
This means that the more terms you place within the quotes, the greater the chance that one of them was broken across a line, so the returned results will be incomplete. It requires a lot of thought when dealing with hyphens and phrases, because you have to anticipate what was actually typeset in the newspaper years ago and what has happened in the text conversion since. Now, Google does use a smart algorithm, so sometimes it succeeds anyway, but in general it is best (even for everyday searches) to limit search terms to the highest-priority words, with as few total terms as possible.
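If you would rather build the two-term query in a script than type it, something like the following Python sketch would do it. The function names are my own invention; the only things taken from above are the site: term with no spaces and the quoted phrase. The search URL format is the ordinary Google one with a q parameter.

```python
from urllib.parse import urlencode

def trove_google_query(phrase, site="trove.nla.gov.au"):
    """Return the two-term query: a site restriction, then the quoted phrase."""
    # Term 1 has no spaces; term 2 is wrapped in straight double quotes.
    return f'site:{site} "{phrase}"'

def google_search_url(query):
    """Percent-encode the query and attach it to the Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})

query = trove_google_query("antarctica")
print(query)
print(google_search_url(query))
```

Keeping the phrase to one or two high-priority words, as suggested above, lowers the odds that a line-break hyphen spoils the match.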
I especially liked the 7th and 8th columns, where they talked about birds dropping dead from the heat, prolonged twilight because of atmospheric smoke from brush fires, and almost 80 years of rainfall data for Burra.
Articles like those in this newspaper are why I tell my steadily decreasing number of alarmist friends, “There is nothing new under the Sun! You need to get over yourself.”
I’d love to know how much the ’30s were adjusted down in the GISS data whilst hiding the decline.