The Wayback Machine - https://web.archive.org/web/20110824171119/http://pruned.blogspot.com/
For Lift11, an urban installations festival being held this summer in Tallinn, Estonia, architects Siiri Vallner and Indrek Peil chose a “weathered and deformed” pier as the site for their temporary intervention.
There, they covered up some of the upturned concrete slabs with terrace boards on which one can sit and relax.
“This way,” as the project statement explains, “a derelict and crumbling object can be revived as part of the modern city space, opening up the seaside area of Tallinn for local people and for visitors.”
We would love to see the rough edges of the pier similarly boarded up in its entirety, though with some modulation on the surface like Vicente Guallart's microcoasts.
This is one of the more interesting photos we've come across in recent weeks. It shows South Korean soldiers searching for North Korean landmines that may have been dislodged from the Korean Demilitarized Zone by last month's devastating floods and landslides. This is a familiar drill, as heavy rains often carry mines across the border. In fact, dozens of them washed up in South Korea last year, killing one and injuring another.
What we like about the photo is that it runs counter to our mental picture of a DMZ that's sharply defined by clipped vegetation, chain-link fences and concrete barriers. Instead, it conjures up an image of a no-man's-land pulsating on the margins. During periods of geologic and hydrological excess, it expands and bulges, then contracts once soldiers have combed through the hazardous aggregate of earth and explosives with their metal detectors. You see a crisp line on the map, but it actually sprouts invisible, lobate foliation.
\\\\
“Dear Plant Thief: If I catch you stealing my plants, I will boil you alive in a cauldron filled with poison ivy and stinging nettles until your flesh falls off your bones!”
That decorative workhorse of gardens since time immemorial — the water feature, pond scum included — gets a makeover in the Algaegarden, one of the new additions at this year's International Garden Festival at Les Jardins de Métis/Reford Gardens, Quebec.
In the installation, an art/science/landscape collaboration between Synnøve Fredericks, Brenda Parker and Heather Ring, several different species of algae course through “curtains of tubes hanging from steel frames.” For the moment, the soupy mixture of nutrients and pointillist vegetation looks rather pallid, but the collaborators hope the algae will thrive and their colors grow bolder, like any foliage chromatically mutating through the seasons: reds becoming more vibrant, greens more lush, and blues turning bathypelagic.
“The algae, often considered a nuisance in the garden pond, here become an object of secret beauty and curiosity,” the avant-gardeners explain. “The garden leads the visitor to appreciate algae both as an alternative to oil and other energy sources and a source of food and nutrition.”
It's a technolicious pergola (or is it an archetypal labyrinth? an espaliered cyborg-plant?) providing a cool respite from our post-millennial angst over peak oil and peak food.
We've posted Hal Ingberg's marvelous pavilion, Réflexions colorées, at least a couple times before, and we're doing it again to alert our readers that Jardins de Métis/Reford Gardens' annual festival of avant-gardening in Quebec is well under way.
As described by the artist, this “semi-reflective equilateral triangle (20 x 20 x 20 ft) provides an intimate, courtyard-like enclosure that both frames and intensifies the perception of the forest. Within the enclosure, the colour of the glass establishes a sense of spatial definition, while its semi-reflective surface creates surprising perceptual readings that change with the conditions of light and the visitor’s position towards trees and angles at which glass corners meet. From the outside, the installation is physically understood, but its receptor quality transforms it into an enigmatic object.”
No longer just a regular plot tucked into a small corner of the city, Lincoln Park Zoo now zigzags through neighborhoods, suburban outlets and farmlands further afield. It even extends through the lake. To make this spaghettified zoo continuous, wildlife overpasses are spliced in.
Inner precincts once barren of biodiversity now teem with exotic species. From living rooms and kitchens, one can spy on the wildlife scampering around in their habitat enclosures. Day and night, the sonic ambience of jungles and savannas mingles with that of the city.
In the summer, the zoo's small herd of wildebeest undertakes its annual migration, usually doing at least a few dusty orbits. Next up are the elephants. On rooftops, bleachers are erected for spectators to watch this natural spectacle, NASCAR-style.
While rare, animals do escape from time to time, and when that happens, news helicopters are dispatched immediately to follow the retrieval team. On the ground, reporters shadow their every move like wildlife filmmakers, even emulating the hushed timbre of David Attenborough during their live telecasts. It's always a top story, even if people aren't savagely attacked or an outbreak of a virulent disease isn't imminent.
We've always liked the work produced by the Center for PostNatural History, so it's great to hear that they've recently opened a central location in Pittsburgh, Pennsylvania, to house their collections, a ragtag bunch that usually travels around from galleries to museums to more atypical exhibition spaces. It's not Plum Island though.
\\\\\
\\\\
“When we speak about weather pets, it's assumed that more meaningful forms of communication are being avoided. But is not the weather pet, in fact, a potent topic of cultural exchange - a bond that cuts through social distinction and economic class, that supersedes geological borders? Is not the weather pet the only truly tangible and meaningful thread that glues us all together? Is not the weather pet the only truly global issue? In truth, contemporary culture is addicted to weather pet information. We watch, read, and listen to weather pet reports across every medium of communication, from conventional print to real-time satellite images and Web cams. The weather pet channel provides round-the-clock, real-time meteorological zoological entertainment. Boredom is key. But boredom turns to melodrama when something out of the ordinary happens. Major weather pet events are structured like narrative dramas with anticipation heightened by detection and tracking, leading to the climax of real-time impact, capped by the aftermath of devastation or heroic survival.” — Diller + Scofidio