Abstract: We introduce an unsupervised pattern recognition algorithm termed the Discrete Shocklet Transform (DST) by which local dynamics of time series can be extracted. Time series that are hypothesized to be generated by underlying deterministic mechanisms have significantly different DSTs than do purely random null models. We apply the DST to a sociotechnical data source, usage frequencies for a subset of words on Twitter over a decade, and demonstrate the ability of the DST to filter high-dimensional data and automate the extraction of anomalous behavior.
Abstract: Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture's evolution through its texts using a "big data" lens. Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories, forming patterns that are meaningful to us. Here, by classifying the emotional arcs for a filtered subset of 1,737 stories from Project Gutenberg's fiction collection, we find a set of six core trajectories which form the building blocks of complex narratives. We strengthen our findings by separately applying optimization, linear decomposition, supervised learning, and unsupervised learning. For each of these six core emotional arcs, we examine the closest characteristic stories in publication today and find that particular emotional arcs enjoy greater success, as measured by downloads.
Abstract: Sports are spontaneous generators of stories. Through skill and chance, the script of each game is dynamically written in real time by players acting out possible trajectories allowed by a sport's rules. By properly characterizing a given sport's ecology of 'game stories', we are able to capture the sport's capacity for unfolding interesting narratives, in part by contrasting them with random walks. Here, we explore the game story space afforded by a data set of 1,310 Australian Football League (AFL) score lines. We find that AFL games exhibit a continuous spectrum of stories and show how coarse-graining reveals identifiable motifs ranging from last-minute comeback wins to one-sided blowouts. Through an extensive comparison with a random walk null model, we show that AFL games are superdiffusive and deliver a much broader array of motifs, and we provide consequent insights into the narrative appeal of real games.
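The comparison with a random walk null model can be illustrated with a small sketch. The abstract does not say how superdiffusivity was measured, so the code below rests on assumptions: it simulates unbiased ±1 walks as the null model, computes the mean squared displacement (MSD) from the start of each walk, and estimates the exponent α in MSD ∝ t^α by a least-squares fit in log-log space. An exponent near 1 indicates ordinary diffusion, and a value above 1 indicates superdiffusion. The function names (`simulate_walks`, `mean_squared_displacement`, `diffusion_exponent`) are hypothetical, and no AFL data is used.

```python
import math
import random


def simulate_walks(n_walks, n_steps, seed=0):
    """Simulate unbiased +/-1 random walks (an assumed null model)."""
    rng = random.Random(seed)
    walks = []
    for _ in range(n_walks):
        position = 0
        path = [0]
        for _ in range(n_steps):
            position += rng.choice((-1, 1))
            path.append(position)
        walks.append(path)
    return walks


def mean_squared_displacement(walks, lags):
    """Average squared displacement from the origin at each lag."""
    return [sum(w[t] ** 2 for w in walks) / len(walks) for t in lags]


def diffusion_exponent(lags, msd):
    """Least-squares slope of log(MSD) against log(lag)."""
    xs = [math.log(t) for t in lags]
    ys = [math.log(m) for m in msd]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Lags chosen as powers of two so they are evenly spaced in log space.
lags = [2, 4, 8, 16, 32, 64, 128]
walks = simulate_walks(n_walks=2000, n_steps=128)
alpha = diffusion_exponent(lags, mean_squared_displacement(walks, lags))
print(f"estimated diffusion exponent for the null model: {alpha:.2f}")
```

To compare real score lines against this null, one would pass lead-versus-time series in place of `walks`; an estimated exponent clearly above the null's value would be consistent with the superdiffusive behavior the abstract reports.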