Comments on Sapping Attention: Poor man's sentiment analysis
Blog author: Ben. Feed updated 2024-03-23T00:59:24.057-04:00. 5 comments.

Tim Hitchcock, 2012-02-12T10:26:40.769-05:00

Thanks for this link - it looks perfect and I will spend some time playing. I can quite see the problems over pre-computation, etc., but if the corpus was in good shape, it would only need doing once!

Ben, 2012-02-08T13:31:13.857-05:00

Brett, Ted: Thanks.

Tim: Have you seen Ted's Ngrams correlation viewer (http://leovip026.ncsa.uiuc.edu/Correlation/)? That does most, I think, of what you're suggesting. The big problem with correlation finders (both the Illinois and Google ones) is that you have to precompute an enormous amount of information (although the Google method seems to have some extremely clever work-arounds for dealing with huge sets).

If you extend to include not just words and phrases but *interactions* like those I'm looking at here, it would get completely out of control, I suspect.

But there are also interesting similarities within each of these grids (certain ways of talking about slavery rise and fall together). That, I think, can be a tremendously useful way of thinking about conceptual history; it's what got me into these things in general, actually.

Tim Hitchcock, 2012-02-08T04:38:39.287-05:00

Wonderful - and really powerful. It made me wonder if you had considered how one might set this up with something that looks like a Google Correlate environment (http://www.google.com/trends/correlate/). The Correlate tool looks a bit useless to me at the moment, but within an established ngram set, it strikes me that it would provide a fun and intuitive way to explore linguistic relationships as an extension to a straight ngram system (though I don't see how one could incorporate distance measures within texts). It also seemed to me the sort of environment most historians could get their heads around.

Ted Underwood, 2012-02-04T11:04:45.127-05:00

I second that. Brilliant approach to the problem of mining 2-grams. I read this and immediately thought "I've got to get something like that set up at Illinois."

I've actually done a bit of messing around with 2- and 3-grams in AWS but it is *not cheap*. Maybe it's cheap for people who have bigger research accounts, but I can't spend $100 every time I have a research question.

Brett Bobley, 2012-02-03T22:29:44.083-05:00

This is some terrific work, Ben. Thanks for sharing it.