author: niplav, created: 2019-07-13, modified: 2020-10-13, language: english, status: in progress, importance: 4, confidence: log
The blog The Real Movement is about Marxism, more specifically about abolishing wage labor, the role of gold in labor analysis, and dunking on other Marxists. Its archives are awkward to navigate, so here is a chronological listing of its posts.
This index currently lists 682 posts from 2013-05-20 to 2019-12-18.
The site was scraped using Python 2 with the libraries urllib2 and BeautifulSoup:

    import urllib2
    from bs4 import BeautifulSoup
    import sys
    import datetime

    for year in range(2013, datetime.datetime.now().year+1):
        yearposts=[]
        # Walk the paginated archive of one year until a page is missing
        for page in range(1, 1000):
            url='https://therealmovement.wordpress.com/{}/page/{}'.format(year, page)
            req=urllib2.Request(url, headers={'User-Agent' : "Magic Browser"})
            try:
                con=urllib2.urlopen(req)
            except urllib2.HTTPError, e:
                # An HTTP error marks the end of this year's archive
                break
            data=con.read()
            soup=BeautifulSoup(data, 'html.parser')
            posts=soup.find_all(class_="post")
            for p in posts:
                ts=p.find_all(class_="entry-title")
                if ts==[]:
                    break
                title=ts[0].a.text
                link=ts[0].a.get('href')
                # Drop the first character of the date heading
                date=p.h5.text[1:]
                entry='* [{}]({}) (Jehu, {})'.format(title.encode('utf_8'), str(link), str(date))
                yearposts.append(entry)
        # Archive pages list newest posts first, so print them in reverse
        print('\n### {}\n'.format(year))
        for t in reversed(yearposts):
            print(t)
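
Python 2 has since reached end-of-life, and `urllib2` no longer exists in Python 3 (it was split into `urllib.request` and `urllib.error`). The following is a rough sketch of how the same scrape might look in Python 3, not the script that produced this index. It swaps BeautifulSoup for the standard library's `html.parser` to stay dependency-free. The names `ArchivePageParser`, `format_entries` and `fetch_year` are made up for illustration, and the markup it expects (an element with class `post`, containing a link inside an `entry-title` element and an `h5` with the date) is only what the script above implies, so it may not match the live site.

```python
from html.parser import HTMLParser
import urllib.error
import urllib.request


class ArchivePageParser(HTMLParser):
    """Collect (title, link, date) tuples from one archive page.

    Assumed markup, inferred from the Python 2 script: each post is an
    element with class "post" that contains an "entry-title" element
    wrapping a link, and an <h5> holding the date.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.entries = []
        self._current = None    # fields of the post being read
        self._capture = None    # field currently receiving text
        self._title_tag = None  # tag name of the open entry-title element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "post" in classes:
            self._flush()
            self._current = {"title": "", "link": None, "date": None}
            self._capture = None
            return
        if self._current is None:
            return
        if "entry-title" in classes:
            self._title_tag = tag
        elif tag == "a" and self._title_tag and self._current["link"] is None:
            self._current["link"] = attrs.get("href")
            self._capture = "title"
        elif tag == "h5" and self._current["date"] is None:
            self._current["date"] = ""
            self._capture = "date"

    def handle_endtag(self, tag):
        if (tag, self._capture) in (("a", "title"), ("h5", "date")):
            self._capture = None
        if tag == self._title_tag:
            self._title_tag = None

    def handle_data(self, data):
        if self._capture:
            self._current[self._capture] += data

    def _flush(self):
        # Posts without a title link are skipped.
        if self._current and self._current["link"]:
            c = self._current
            self.entries.append((c["title"], c["link"], c["date"] or ""))
        self._current = None

    def close(self):
        super().close()
        self._flush()


def format_entries(html):
    """Turn one archive page into Markdown list items, in page order."""
    parser = ArchivePageParser()
    parser.feed(html)
    parser.close()
    # date[1:] mirrors the original script, which drops the first character.
    return ["* [{}]({}) (Jehu, {})".format(title, link, date[1:])
            for title, link, date in parser.entries]


def fetch_year(year, base="https://therealmovement.wordpress.com"):
    """Fetch every archive page of a year, returning entries oldest first."""
    entries = []
    for page in range(1, 1000):
        url = "{}/{}/page/{}".format(base, year, page)
        req = urllib.request.Request(url, headers={"User-Agent": "Magic Browser"})
        try:
            with urllib.request.urlopen(req) as con:
                html = con.read().decode("utf-8")
        except urllib.error.HTTPError:
            break  # a missing page marks the end of the year's archive
        entries.extend(format_entries(html))
    return list(reversed(entries))
```

Calling `fetch_year(2013)` and printing each returned line under a `### 2013` heading would then reproduce one section of the listing, provided the site still serves the markup assumed here. Keeping the parsing in `format_entries` separate from the fetching means it can be tried on a saved page without touching the network.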