Originally posted in: BungieNetPlatform
What is the most efficient means of collecting all PvP PGCR data for a month? Or, for example, getting all of this week's ToO matches as they are completed? Would you need people visiting a site and plugging in their name to fire off a query? Are there any decent options for people who don't have users - who just want to do research, not provide a website or service? Are incremental requests to the /PGCR URL + filtering feasible? Follow up: how many requests can you make over a period of time without being a jerk? Are you just temporarily suspended if you go overboard?
There's no way to filter all matches like that, but we do incremental PGCR requests in real time, so it is possible. We also have all ToO data for year 2 stored in a database, so let us know if you want us to run some queries for you. Keep in mind that about 4-6 million PGCR IDs are created each day if you want to write something yourself.
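To put that volume in perspective, here is a rough back-of-the-envelope sketch. It only uses the 4-6 million IDs/day figure quoted above and assumes (my assumption, not a stated fact) that IDs are created evenly across the day; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def required_rate(ids_per_day: int) -> float:
    """Average requests/second needed to fetch every ID created in one day.

    Assumes IDs are created at a uniform rate, which is a simplification.
    """
    return ids_per_day / SECONDS_PER_DAY


low = required_rate(4_000_000)
high = required_rate(6_000_000)
print(f"{low:.1f} to {high:.1f} requests/second")
# prints "46.3 to 69.4 requests/second"
```

So just keeping pace with newly created reports would take a sustained average somewhere around 46 to 70 requests per second, before any backfill of older data.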
We don't have any hard-and-fast throttles in place for the PGCR service, but we ask that you try to limit it to under 20 requests/second. Indeed, I've heard of some sites that increment the PGCR URL. It's a bit of an extreme undertaking, but the benefit you get is total informational awareness. You should hit up @Steffwiz, he is a developer on one of the sites that does this kind of scraping en masse, and perhaps you could collaborate on an effort together - and save yourself a lot of bandwidth costs and trouble if you can work from that already-scraped dataset! What kind of use case were you thinking of? Maybe it's something that can be done without having to scrape all PGCRs? If not, definitely hit up someone like Steffwiz before bothering to re-grab all of this data... it could be that your use cases are similar enough that you can collaborate somehow!
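For anyone who does go the incremental route, a minimal sketch of a throttled ID walker might look like the following. Everything here is an assumption for illustration: `crawl_pgcr_ids` and its parameters are hypothetical names, the default of 15 requests/second is just a margin I picked to stay under the 20/second request above, and `fetch` is a caller-supplied placeholder for whatever actually requests a single report. The real endpoint URL and headers are deliberately not shown.

```python
import time
from typing import Callable, Iterator, Optional, Tuple


def crawl_pgcr_ids(
    start_id: int,
    count: int,
    fetch: Callable[[int], Optional[dict]],
    max_per_second: float = 15.0,  # assumed margin under the 20/s guideline
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[int, Optional[dict]]]:
    """Walk consecutive IDs, spacing calls so the rate never exceeds the cap.

    `fetch` is a hypothetical callable that returns the parsed report for
    one ID, or None if that ID has no report.
    """
    min_interval = 1.0 / max_per_second
    next_allowed = clock()
    for pgcr_id in range(start_id, start_id + count):
        now = clock()
        if now < next_allowed:
            # Too early for the next request: wait out the remainder.
            sleep(next_allowed - now)
        # Schedule the following request one interval after this one.
        next_allowed = max(now, next_allowed) + min_interval
        yield pgcr_id, fetch(pgcr_id)


if __name__ == "__main__":
    # Stand-in fetch that performs no network I/O.
    def fake_fetch(pgcr_id: int) -> Optional[dict]:
        return {"id": pgcr_id}

    for pgcr_id, report in crawl_pgcr_ids(1000, 3, fake_fetch):
        print(pgcr_id, report)
```

The walker takes `sleep` and `clock` as parameters only so the pacing can be tested without real delays. Filtering by game mode would still have to happen on your side after each report is fetched, since there is no server-side filter.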
I imagine you could query user crucible history data.