ORG-tech-vols IRC meeting log 2014-04-08

(20:29:43) Allll: hi all
(20:29:55) graphiclunarkid: Evening everyone
(20:31:02) graphiclunarkid: Hey Allll, how's it going?
(20:31:15) Allll: good thanks, how's the new place?
(20:31:25) graphiclunarkid: Evening dantheta, _46bit_, _guy, hellais, plett, xarlekino :)
(20:31:39) graphiclunarkid: Allll: Pretty good thanks. Mostly settled in now!
(20:32:19) graphiclunarkid: All our stuff is in storage though - we have to find a new place to rent in Norway by the end of May. Then we can send for it all.
(20:33:03) Allll: got to admit I'm pretty jealous; it was a nice place even in the middle of winter. It's amazing how much stuff you realise you don't need at times like this though
(20:33:26) graphiclunarkid: Allll: Yeah! You wouldn't believe the amount of stuff we got rid of!
(20:33:50) graphiclunarkid: If you're interested in the details: https://richardskingdom.net/moving-norway
(20:34:15) Allll: I'll check it out. Hopefully you'll see the lights before it's too late; that was amazing
(20:34:39) dantheta: Hey all
(20:34:40) Allll: that's some impressive packing
(20:34:45) graphiclunarkid: Hey dantheta :)
(20:34:53) graphiclunarkid: Allll: Yeah - it was a really close-run thing!
(20:35:08) graphiclunarkid: Allll: I didn't realise you'd been to Tromso - when were you here?
(20:35:38) plett: graphiclunarkid: Hi
(20:35:48) graphiclunarkid: Evening plett - how's it going?
(20:35:51) Allll: January this year, just for a few days to see the lights, we were pretty lucky with that
(20:36:16) graphiclunarkid: Allll: Amazing! I was here over Christmas too. Saw a brilliant display of northern lights on new year's day :)
(20:36:58) plett: graphiclunarkid: Pretty good. The first two of the DSL lines at the A&A offices have gone in so far, the rest should be there by the end of the week
(20:37:14) graphiclunarkid: plett: That's great news!
(20:37:25) dantheta: Cool!
(20:37:43) graphiclunarkid: plett: Vasilis will be pleased - he'll finally have some actually censored lines to run ooniprobes against :)
(20:38:13) graphiclunarkid: (The trouble with volunteer anti-censorship campaigners is that their internet connections all work properly!)
(20:38:57) graphiclunarkid: Allll: I saw a thread about problems with the staging server not recording things in its database correctly. Did the suggestions Matt gave bear any fruit?
(20:39:08) dantheta: This is true. My poor PAYG vodafone sim will be glad to have the rest
(20:39:50) graphiclunarkid: dantheta: Indeed. Thanks for the testing with that though! I'm pretty excited about how far you've got now with integrating the middleware stuff :)
(20:39:54) Allll: He was looking at it, but I don't think it's been sorted yet - looks like some kind of setup issue though, as far as I can tell it should all be working fine
(20:40:21) dantheta: I might be able to help with that - I have database root
(20:40:29) dantheta: If the staging box is dev-censor-1, that is
(20:40:35) graphiclunarkid: Allll: If it would help I can give you access to the server itself.
(20:40:36) plett: graphiclunarkid: we should have VMs ready to be used early next week. I guess at that point we need to decide on the technicalities of who gets logins and by what means
(20:40:47) Allll: http://stage.blocked.org.uk/getinvolved.html
(20:41:01) graphiclunarkid: plett: Yeah - probably worth starting a thread on the mailing list to discuss it?
(20:41:03) Allll: is that the same server dan?
(20:41:11) dantheta: Yep, same IP
(20:41:41) Allll: @glk that may be useful, I won't be able to do much till May now, I'm trying to finish up the layout of the pages and all HTML stuff now so that that's all ready to go
(20:41:57) graphiclunarkid: plett: With dev-censor-1 we've granted ssh access to a couple of people who are contributing actively. dantheta, Gareth (who we haven't seen for a while) and myself I think.
(20:42:21) graphiclunarkid: Allll: Yeah, I saw you were going to be unavailable. Going away for Easter?
(20:42:42) Allll: yeah, off for a couple of weeks or so :D
(20:43:05) graphiclunarkid: Allll: Cool! Is there anything I or others can pick up while you're not here? If so, just yell on the list or raise github issues, and I'll see who else is available (or do them myself).
(20:43:09) graphiclunarkid: (If I can!)
(20:43:32) plett: graphiclunarkid: I would be happy for all those people to have access to our VMs if need be.
(20:43:33) Allll: As it stands at the moment I'm editing the raw HTML (can see the latest here: https://dl.dropboxusercontent.com/u/12755204/blocked-org-uk/raw_html/index.html) and I'll get that into ModX before I go - you can see it more or less
(20:43:55) Allll: before I go I'll send my To do list to the mailing list so everyone can see where it stands
(20:44:58) Allll: The main issue is going to be integrating the API. ModX has been interesting to learn and I had a few issues that slowed me down, but regarding getting the data across I think we need to work out the best way to do that
(20:45:36) Allll: The best bet to get something working as quick as possible would be to get a cron job pulling the data out of the FormSave table in MySQL that ModX should be saving to
(20:45:44) Allll: and then sending that over
(20:45:56) plett: graphiclunarkid: From A&A's point of view, the ideal solution would be for "the ORG" (whatever that actually means in this context) to decide who needs access and give us a list of account names and ssh keys to be put on the VMs
(20:46:09) graphiclunarkid: plett: I expect you'll want someone at your end to administrate the box - we can send them SSH keys for user accounts. Also happy to share the load on admin duties if you want to reduce time overheads. Up to you - it's your network and hardware!
(20:46:24) graphiclunarkid: plett: Great minds type alike! Yes, we can do that.
(20:46:41) dantheta: Sysadmin by day, happy to help out too.
(20:46:43) Allll: in the future I'd really like to get it done in JS and have something a lot better in place, but it may prove tricky trying to save the data to ModX as well as pushing to API - but sure we can work it out in time
(20:46:46) graphiclunarkid: Allll: Cool - yes, I think integration is the next big challenge.
(20:47:12) dantheta: I'm not 100% sure that we would want to call the middleware API directly from client-side javascript yet
(20:47:19) graphiclunarkid: Allll: We can always iterate later. Getting the first version up and running will be a major milestone though.
(20:48:01) dantheta: The authentication works on a secret string shared between the API and a client - the admin user's secret key would wind up in client-side JS
(20:48:03) Allll: I did have a fairly quick look at the API a week or two ago but couldn't quite see where to send the data, but like you say we can do that after we have something running
(20:48:34) plett: graphiclunarkid: We would prefer to manage the VMs ourselves in terms of installing updates etc, since they're on our network. We should be able to put them into the same puppet group that our servers/desktops are managed with.
(20:48:41) Allll: @dan cool do you reckon a cron to pull out the latest data from a modx db is the best way to proceed - I'm open to sugfgestions
(20:49:00) graphiclunarkid: plett: Sure, that's entirely reasonable!
(20:49:06) Allll: or suggestions even!
(20:50:40) graphiclunarkid: Allll: dantheta: All this is on the same server at the moment, but when we launch the site it'll move over to ORG's main MODx instance, so they'll then be on different machines.
(20:50:41) dantheta: The probes on the VMs will be polling the API for test urls. If you get really stuck submitting URLs, we can arrange for a job on the modx server to push URLs across to the API's database (since the API database and the modX database are on the same box)
(20:51:02) dantheta: Sorry, my connection is a little laggy
(20:51:17) dantheta: I'm using freenode webui
(20:51:25) dantheta: On hotel wifi
(20:52:01) Allll: @dan I think that will be best, I've not used modx before and I'd need to create a plugin to handle that - I reckon something to pull the data between dbs would be easier by far
(20:52:43) Allll: @glk could be trickier, but I guess a job running in the background could use the key without the same issues as via JS?
(20:52:59) dantheta: That's right - the key would stay on the server.
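A minimal sketch of the point dantheta and Allll agree on here: if a server-side job signs each submission, the shared secret never has to appear in client-side JS. The log only says that authentication uses a secret string shared between the API and a client; the HMAC-SHA256 scheme, the `API_SECRET` environment variable, and the field names below are all assumptions for illustration, not the middleware's actual interface.

```python
import hashlib
import hmac
import os


def sign_request(secret: str, payload: str) -> str:
    """Return a hex HMAC-SHA256 signature of payload (assumed scheme)."""
    return hmac.new(
        secret.encode("utf-8"), payload.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def build_signed_submission(url: str, email: str, secret: str) -> dict:
    """Build the fields a background job might send to the API.

    The field names are hypothetical. Only the signature leaves the
    server; the secret itself is never included in the result.
    """
    return {"url": url, "email": email, "signature": sign_request(secret, url)}


if __name__ == "__main__":
    # The secret is read from the server's environment, so it is never
    # embedded in any page or script delivered to a browser.
    secret = os.environ.get("API_SECRET", "example-secret")
    print(build_signed_submission("http://example.com", "admin@example.org", secret))
```

Run from a cron job or other background process, something like this would keep the key on the server, which is the property the browser-side approach lacks.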
(20:53:06) graphiclunarkid: Allll, dantheta: sounds sensible. We're only going to have one source of URLs - the website - so a DB sync is as good as an API call right now.
(20:53:15) dantheta: There's still a similar issue with pulling the json objects for test results
(20:53:16) graphiclunarkid: Allll: Yeah, we can make it a background process.
(20:53:46) Allll: @glk cool
(20:54:09) dantheta: Retrieving data is a good bit easier to work around though.
(20:54:14) graphiclunarkid: Allll: dantheta: a possible point to consider is how quickly results would get returned. If we were synchronising databases periodically we're pretty much resigning ourselves to "request now, email results later" land.
(20:54:42) graphiclunarkid: That would be a good incentive to iterate towards a more real-time API call though :)
(20:54:46) Allll: @dan do you have any suggestions for returning results - I guess we could do a db sync and then I could just write a PHP page to spit out some JSON and use JS to format that - would be easier than trying to pull it through modx I suspect
(20:54:49) dantheta: I've been thinking that I'd like to investigate using a message-queue for scalability
(20:55:41) Allll: @glk I was expecting it to be a delayed thing, but it could show previous results if a site had been checked before.
(20:56:00) dantheta: Allll: I think that's pretty much how I was thinking.
(20:56:43) dantheta: DB sync is only needed as a workaround for submitting URLs. Retrieving results is just a json call to a PHP script that has the user credentials and passes the json straight on
(20:56:46) graphiclunarkid: Yeah, I think that's what we'll be doing at first, though I reckon it'd be nice if you could see the results coming back to you while you're on the same page. Later though!
(20:57:06) Allll: a quick hack would be to have the job in a php script and just call that from some JS when the form is submitted - that still wouldn't return the results instantly though
(20:58:00) Allll: @dan in that case if we had a PHP script on the server (nothing to do with modX) that could spit out the results in JSON
(20:58:10) graphiclunarkid: Yeah, it's never going to be instant. It might be quite quick to go website -> queue -> A&A probes -> results -> database -> refresh though. But I'm getting ahead of myself.
(20:58:14) dantheta: Allll: That's right
(20:58:14) ***graphiclunarkid stops getting excited...
(20:58:21) Allll: cool :)
(20:58:51) dantheta: glk: the excitedness isn't unwarranted - I'm pretty handy with AMQP.
(20:59:08) graphiclunarkid: dantheta: \o/
(20:59:42) graphiclunarkid: Allll, dantheta: I spent some time reading up about ooniprobe. I'm a bit worried our system and theirs have diverged a bit in terms of how URLs are passed out and results returned.
(21:00:12) graphiclunarkid: ooniprobe doesn't have an API yet - but the one they've designed treats the probe as the API server and the back-end calls methods on it to send new URLs, start tests, and stop tests.
(21:00:21) dantheta: glk: I'm not really sure how similar they were in the past ...
(21:00:22) graphiclunarkid: That's backwards to what we have so far, isn't it?
(21:00:31) graphiclunarkid: dantheta: Good point!
(21:01:50) graphiclunarkid: Still, if we're going to integrate ooniprobes as we've planned, we're going to have to tackle that at some point.
(21:01:56) dantheta: glk: When I started to do bits of maintenance on the API, I was following the model that NetworkString adopted with the android probes, which didn't relate too much to ooni. The messaging and results were all custom.
(21:02:26) graphiclunarkid: dantheta: Indeed. The android probe != ooniprobe and never has done.
(21:03:05) dantheta: I'm wondering if we should feed both the same URL queue in their own way, have them report results in their own way (ooni -> ooni-backend, android -> middleware) and join the two together at the results DB level
(21:03:47) graphiclunarkid: dantheta: Yes, that's a possibility.
(21:04:08) dantheta: ooni's big advantage over the android probe (and others) is that it can work out for itself the means and method that any ISP has used to block a particular URL. The android probe (and other probes in that fashion) need to be given rules on a per-isp basis
(21:04:16) dantheta: Or that's the pattern that's seemed to emerge
(21:04:20) dantheta: at least
(21:05:10) graphiclunarkid: dantheta: OONI is quite simple in that respect. It fetches the URL over HTTP and again over Tor then compares the two. If they're more than a little bit different it cries foul.
(21:05:50) graphiclunarkid: (It allows for a fudge-factor to account for things like CDNs serving subtly different content based on IP geolocation; different advertising banners; etc)
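The comparison graphiclunarkid describes can be sketched roughly as below. This is an illustration only: comparing body lengths and the 0.8 tolerance are assumptions chosen to show the idea of a fudge factor, not a statement of how ooniprobe actually implements the check.

```python
def bodies_differ(control_len: int, experiment_len: int,
                  tolerance: float = 0.8) -> bool:
    """Flag a page as suspicious if two fetches differ too much in size.

    control_len:    body length fetched over Tor (assumed unfiltered).
    experiment_len: body length fetched directly over the tested network.
    tolerance:      hypothetical fudge factor; ratios at or above it are
                    treated as the same page (CDN or advert differences).
    """
    if control_len == 0 and experiment_len == 0:
        return False  # both empty: nothing to distinguish
    if control_len == 0 or experiment_len == 0:
        return True   # one side returned nothing at all
    ratio = min(control_len, experiment_len) / max(control_len, experiment_len)
    return ratio < tolerance


if __name__ == "__main__":
    print(bodies_differ(1000, 950))  # small difference, within tolerance
    print(bodies_differ(1000, 300))  # large difference, e.g. a block page
```

A short block page served in place of a full site would give a low ratio and be flagged, while a slightly different advert banner would not.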
(21:05:59) dantheta: While the custom probes might give us a fast yes/no on a particular ISP/URL combination (which is what we want for the website), ooni will be the more future-proof option
(21:06:42) graphiclunarkid: dantheta: Does the android probe actually poll a queue? Or does the middleware push new URLs out to registered probes?
(21:07:48) dantheta: The android probe uses Google Cloud Messaging to be made aware that there are queued URLs available (as I understand it). It's a mobile power-saving optimization which also gets down unnecessary polling and long waits.
(21:07:59) dantheta: s/gets/cuts
(21:08:16) graphiclunarkid: Ah - I see. I wondered what GCM was doing if there was polling going on! That explains it.
(21:09:18) graphiclunarkid: Well, there would be nothing to stop us from forking ooniprobe and implementing a similar polling mechanism, though I doubt it would be accepted upstream as they already have a different API design.
(21:09:39) dantheta: I was actually thinking of wrapping ooniprobe instead of forking.
(21:09:50) graphiclunarkid: Or we could implement some kind of a broker that polls for new URLs then pushes them to ooniprobes.
(21:10:04) graphiclunarkid: dantheta: Yeah, or wrapping it locally, as you say.
(21:10:11) dantheta: It already has a command-line mode to run over a batch of files. A simple shell/php/python script that pulls <n> urls from the API, writes to a file, then runs ooniprobe on that file
(21:10:25) graphiclunarkid: dantheta: Actually that's probably what we'll need to do in the short term .... as you have just pointed out!
(21:10:50) graphiclunarkid: That would get us our initial working version. We can think about how to improve it after that.
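The wrapper dantheta outlines (pull <n> URLs from the API, write them to a file, run the probe on that file) might look something like the sketch below. How URLs are fetched from the API and the exact probe command line are not given in the log, so both are passed in by the caller: `fetch_urls` and `probe_cmd` are hypothetical parameters, and the demo uses the Python interpreter as a stand-in for the real probe.

```python
import os
import subprocess
import sys
import tempfile
from typing import Callable, List, Sequence


def run_batch(fetch_urls: Callable[[int], List[str]],
              probe_cmd: Sequence[str],
              n: int = 10) -> int:
    """Fetch up to n URLs, write them one per line, run the probe on the file.

    fetch_urls: caller-supplied function returning at most n URLs
                (for example, a call to the middleware API).
    probe_cmd:  command prefix to run; the URL file path is appended
                as the final argument.
    Returns the number of URLs handed to the probe.
    """
    urls = fetch_urls(n)
    if not urls:
        return 0  # nothing queued, so do not start the probe
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("\n".join(urls) + "\n")
        path = handle.name
    try:
        # check=True raises CalledProcessError if the probe exits non-zero.
        subprocess.run(list(probe_cmd) + [path], check=True)
    finally:
        os.remove(path)
    return len(urls)


if __name__ == "__main__":
    # Stand-in probe: a tiny Python command that just reads the URL file.
    stand_in = [sys.executable, "-c", "import sys; print(open(sys.argv[1]).read())"]
    queued = ["http://example.com", "http://example.org"]
    print(run_batch(lambda n: queued[:n], stand_in, n=5))
```

In a real deployment `probe_cmd` would be whatever ooniprobe invocation accepts a file of URLs, and the script could be run periodically from cron on each probe machine.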
(21:11:59) dantheta: I'm still trying out RasPi images as Vasilis sends them. When we've got a really stable build I can stick a set of API creds and a polling script on them.
(21:12:38) graphiclunarkid: dantheta: Cool - is he pushing those out fairly actively then?
(21:12:49) dantheta: There's still the Alexa top 10^n to run
(21:13:30) graphiclunarkid: dantheta: Yeah - I think Vasilis is keen to establish a baseline report using that list. I expect it'll be the first job we give to plett's probes at A&A so we can have the info for all the major ISPs.
(21:13:47) dantheta: The last image I tried was the weekend before last, I think. Unfortunately I had a busy week (getting ready for a holiday) and didn't have a chance to try it until later in the week
(21:14:18) graphiclunarkid: dantheta: Understandable. Good to know things are progressing on that front too is all :)
(21:14:30) graphiclunarkid: dantheta: Going away for Easter too then, I take it?
(21:14:43) dantheta: I'm currently in the Czech Republic :)
(21:15:07) dantheta: My hotel blocks IRC
(21:15:11) graphiclunarkid: dantheta: Wow - awesome! Allll is also off for a few weeks shortly. I'm starting to get jealous of you all - except that I just moved countries too :)
(21:15:25) graphiclunarkid: dantheta: Not very well, clearly ;)
(21:15:39) dantheta: hehe :)
(21:16:07) Allll: :0
(21:16:30) dantheta: As a little diversion before leaving, I wrote a noddy half-probe in Go, implementing the URL list and results submit API endpoints
(21:16:52) dantheta: I had that running on the Vodafone device for a week before it ran out of credit
(21:17:13) graphiclunarkid: dantheta: I wondered what you meant by that comment on the list. Sounds interesting!
(21:17:43) graphiclunarkid: dantheta: If you want an ORG repo for it just let me know ;-)
(21:17:59) dantheta: It will probably never be ready for primetime, and I'm hoping that Kori / other volunteer will pick up the Java library and do a proper one!
(21:18:13) dantheta: Vodafone's "your url is blocked" page has invalid headers that break Go's standard library
(21:18:23) dantheta: I had to hack up the stdlib to get it to run
(21:18:55) dantheta: very nasty
(21:19:35) graphiclunarkid: dantheta: At least that'll make it easier to spot! I read that one network has started just returning a "page broken" response rather than a custom "we blocked it" message. That'll make things more difficult.
(21:20:01) graphiclunarkid: I guess we should wrap up soon. Allll, dantheta, plett: you should all feel free to spam me with to-do lists in the run up to Easter. I'm officially working for ORG two days per week now. I'm working Thursday & Friday this week; Monday & Tuesday next week so let me know what you need and I'll get it done.
(21:20:26) graphiclunarkid: (I'm away Maundy Thursday through Easter Monday though)
(21:21:02) graphiclunarkid: I am also going to spend some time hacking on this over the weekend. If anyone fancies getting together for a bit of a virtual hackathon please let me know! I'll suggest this on the list too.
(21:21:47) Allll: Cool, if anyone has any comments on design then let me know/stick it in github - latest is at https://dl.dropboxusercontent.com/u/12755204/blocked-org-uk/raw_html/index.html and once in modx here: http://stage.blocked.org.uk/index.html
(21:22:41) graphiclunarkid: Watch out for a blog post on ORG's main blog this week describing what we've been up to and asking more people to get involved too.
(21:22:43) dantheta: When you've got form URL submission into Modx working, ping me and I'll whip something together for populating the API queue
(21:23:08) graphiclunarkid: Allll: If you can send me an SSH public key I'll give you an account on dev-censor-1.
(21:23:29) Allll: @dan will do, though I think someone else will need to fix it
(21:23:29) graphiclunarkid: You might not get much chance to make use of it before you go away but at least it'll be there if you do need it.
(21:23:39) Allll: @glk ok will send it over shortly
(21:23:49) graphiclunarkid: I will also take a look at fixing that modx database thing - in so far as I am able.
(21:24:01) graphiclunarkid: Great.
(21:24:32) graphiclunarkid: Well my dinner is about to burn so I'm going to shoot off and rescue it now. I'll be hanging out in the channel for a while yet if anyone needs me though.
(21:24:46) Allll: cya
(21:24:56) graphiclunarkid: Thanks for coming along Allll, plett, dantheta. See you around :)
(21:25:31) dantheta: Indeed - have a good one all.