-
Is it possible to accumulate statistics from the buildbot runs?
Specifically I thought it would be interesting to keep track of the size
of our build, the time it takes to build, and the time to run the flunc
tests.

It wouldn't be particularly scientific, but it'd be interesting I think.
We'd just have to write to some file outside of the build, I guess?  Maybe
one file per build, structured in some way to make it easy to extract and
easy to add new statistics, and probably written so you can just append to
the file to track something new.

Maybe each thing would start with '>>variable_name:' (something relatively
unique so that we can have multi-line values).  Then we could do:

    STAT_FILE="/persistent-dir/stats/stat-$(date '+%Y-%m-%d-%s.txt')"
    record_stat () {
        echo -n ">>$1: " >> "$STAT_FILE"
        shift
        echo "$@" >> "$STAT_FILE"
        echo >> "$STAT_FILE"
    }
    record_stat 'start-time' "$(date '+%s')"
    pushd "$BUILD_DIR"
    # Quote the substitution so multi-line du output stays intact
    record_stat 'du-sizes' "$(du -s . *)"
    popd
    ...
    record_stat 'end-time' "$(date '+%s')"

And we'd have a record.  Or maybe buildbot even has a convention for this
already?

Ian
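
A minimal sketch of a reader for the '>>variable_name:' format proposed above. Nothing here comes from buildbot. The function name parse_stats, the exact marker pattern, and the sample values are assumptions made for illustration.

```python
import re

# Assumed marker syntax: a line starting with ">>", then a name made of
# letters, digits, "_" or "-", then ":" and the start of the value.
MARKER = re.compile(r"^>>([A-Za-z0-9_-]+):[ \t]?(.*)$")


def parse_stats(text):
    """Return {name: value} for text appended by record_stat-style calls."""
    stats = {}
    name = None
    lines = []
    for line in text.splitlines():
        match = MARKER.match(line)
        if match:
            # A new marker closes the previous value, if there was one.
            if name is not None:
                stats[name] = "\n".join(lines).strip()
            name, first = match.groups()
            lines = [first]
        elif name is not None:
            # Any other line continues the current (multi-line) value.
            lines.append(line)
    if name is not None:
        stats[name] = "\n".join(lines).strip()
    return stats


# Illustrative contents of one stat file (values are made up).
sample = """>>start-time: 1209569667

>>du-sizes: 1200\t.
800\tsrc

>>end-time: 1209569900

"""

stats = parse_stats(sample)
build_seconds = int(stats["end-time"]) - int(stats["start-time"])
```

Because a value runs until the next marker, a new statistic can be added by appending to the file without touching the reader.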
-
On Wed, Apr 30, 2008 at 10:34:27AM -0500, Ian Bicking wrote:
> Is it possible to accumulate statistics from the buildbot runs?

Funny, just this morning I was thinking about accumulating some kind
of history so you could track success/failure rate trends.

> Specifically I thought it would be interesting to keep track of the size of
> our build, the time it takes to build, and the time to run the flunc
> tests.

Start and stop times are already tracked by the buildmaster; the
builder's steps each have a getTimes() method that returns a pair of
timestamps.

I'm not sure how useful times would be, since they vary quite a lot
with system load.  Probably the late-night scheduled builds would be
most informative here.

> It wouldn't be particularly scientific, but it'd be interesting I think.
> We'd just have to write to some file outside of the build, I guess?  Maybe
> one file per build, structured in some way to make it easy to extract and
> easy to add new statistics, and probably written so you can just append to
> the file to track something new.
>
> Maybe each thing would start with '>>variable_name:' (something relatively
> unique so that we can have multi-line values).  Then we could do:
>
>     STAT_FILE="/persistent-dir/stats/stat-$(date '+%Y-%m-%d-%s.txt')"
>     record_stat () {
>         echo -n ">>$1: " >> "$STAT_FILE"
>         shift
>         echo "$@" >> "$STAT_FILE"
>         echo >> "$STAT_FILE"
>     }
>     record_stat 'start-time' "$(date '+%s')"
>     pushd "$BUILD_DIR"
>     # Quote the substitution so multi-line du output stays intact
>     record_stat 'du-sizes' "$(du -s . *)"
>     popd
>     ...
>     record_stat 'end-time' "$(date '+%s')"
>
> And we'd have a record.  Or maybe buildbot even has a convention for this
> already?

Doesn't seem to have one.  However, it might not be hard to hack a
builder shell command that includes arbitrary output, like that from du,
so we can at least track that one in the web display.  I'll look into
this.

On a related note, there's an open buildbot ticket about putting
status information into a SQL database, which might make various kinds
of ad-hoc queries easier.  But it doesn't seem targeted at storing
arbitrary data like this.
http://buildbot.net/trac/ticket/24

On another related note, I've been thinking about writing a little
buildbot meta-master web app that would aggregate information from
multiple buildmasters.  My reason for wanting this is that buildbot is
really designed with the idea that a single buildmaster is dedicated
to one "project" that lives in a single source repository, but
A) some things we want to build ("openplans") are in multiple parts of
our svn tree, and B) I really want a single page that summarizes
builds of multiple projects.  Buildbot can't really do that with a
single buildmaster.

I talked briefly about this with Brian Warner at PyCon, and he
confirmed that it would be pretty hard to change Buildbot to support
multiple projects, because the single-project assumption is all over
the place; e.g. the config file is assumed to contain a single
BuildmasterConfig dictionary which tracks only a single change source.
(I've already got some ugly change-filtering going on to enable builds
to be triggered by changes in multiple parts of our repository, and
there are things I just can't do with this approach, like have a
change to sputnik trigger a livablestreets build but NOT a vanilla
opencore build.)

Brian confirmed that the thing to do is run multiple buildmasters.
But I still want that single summary view, so I think that'll have to
be a new app: either a new twisted app that runs multiple masters
in-process, or a separate process that talks to buildmasters via
XML-RPC or HTTP.

--
Paul Winkler
http://www.openplans.org/people/slinkp/profile
yahoo: slinkp23
AIM: slinkp1970
-
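
A rough sketch of the timing idea discussed above: turn a (start, end) timestamp pair, the shape the message says getTimes() returns, into a duration, and keep only builds that started late at night. It works on plain tuples and does not call buildbot. The helper names, the 00:00-05:59 window, the use of UTC, and the sample timestamps are all assumptions; a real script would more likely use the build host's local time zone.

```python
from datetime import datetime, timezone


def duration_seconds(times):
    """Length of a step or build given a (start, end) pair of timestamps."""
    start, end = times
    return end - start


def is_late_night(times, first_hour=0, last_hour=5):
    """True if the start time falls between first_hour and last_hour (UTC)."""
    start_hour = datetime.fromtimestamp(times[0], tz=timezone.utc).hour
    return first_hour <= start_hour <= last_hour


# Illustrative (start, end) pairs keyed by build number; values are made up.
builds = {
    41: (1207965600.0, 1207965800.0),  # started 02:00 UTC
    42: (1208012400.0, 1208012660.0),  # started 15:00 UTC
}

late_night_durations = {
    number: duration_seconds(times)
    for number, times in builds.items()
    if is_late_night(times)
}
```

Restricting the comparison to quiet hours is one way to reduce the load-related noise mentioned above; it does not remove it.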
Paul Winkler wrote:
> On Wed, Apr 30, 2008 at 10:34:27AM -0500, Ian Bicking wrote:
>> Is it possible to accumulate statistics from the buildbot runs?
>
> Funny, just this morning I was thinking about accumulating some kind
> of history so you could track success/failure rate trends.
>
>> Specifically I thought it would be interesting to keep track of the size of
>> our build, the time it takes to build, and the time to run the flunc
>> tests.
>
> Start and stop times are already tracked by the buildmaster; the
> builder's steps each have a getTimes() method that returns a pair of
> timestamps.
>
> I'm not sure how useful times would be, since they vary quite a lot
> with system load.  Probably the late-night scheduled builds would be
> most informative here.

Yes, being able to filter out builds based on time of day might be useful.
There will also be a certain amount of noise in anything time-related, but
we could still see trends as a result.

>> It wouldn't be particularly scientific, but it'd be interesting I think.
>> We'd just have to write to some file outside of the build, I guess?  Maybe
>> one file per build, structured in some way to make it easy to extract and
>> easy to add new statistics, and probably written so you can just append to
>> the file to track something new.
>>
>> Maybe each thing would start with '>>variable_name:' (something relatively
>> unique so that we can have multi-line values).  Then we could do:
>>
>>     STAT_FILE="/persistent-dir/stats/stat-$(date '+%Y-%m-%d-%s.txt')"
>>     record_stat () {
>>         echo -n ">>$1: " >> "$STAT_FILE"
>>         shift
>>         echo "$@" >> "$STAT_FILE"
>>         echo >> "$STAT_FILE"
>>     }
>>     record_stat 'start-time' "$(date '+%s')"
>>     pushd "$BUILD_DIR"
>>     # Quote the substitution so multi-line du output stays intact
>>     record_stat 'du-sizes' "$(du -s . *)"
>>     popd
>>     ...
>>     record_stat 'end-time' "$(date '+%s')"
>>
>> And we'd have a record.  Or maybe buildbot even has a convention for this
>> already?
>
> Doesn't seem to have one.  However, it might not be hard to hack a
> builder shell command that includes arbitrary output like from du so
> we can at least track that one in the web display.  I'll look into
> this.

I don't know if it'll be that interesting.  It might be interesting.  If we
have a basic sort of structure, then saving the information won't be very
hard, and we can determine how interesting it is later, and ignore it if it
isn't interesting.

> On a related note, there's an open buildbot ticket about putting
> status information into a sql database, which might make various kinds
> of ad-hoc queries easier.  But it doesn't seem targeted at storing
> arbitrary data like this.
> http://buildbot.net/trac/ticket/24

I would guess that parsing a thousand files and collating the information
wouldn't be all that slow.  If the records exist, then we can figure out if
there's a useful way to interpret them.

Like, an interesting record would be the length of easy-install.pth, which
is roughly how many libraries are involved in a package.  But I don't want
to think about it much right *now* -- in a few months it'd be interesting
to look at.  If I could just add this to a shell file, that'd be useful:

    echo ">>easy-install-length:" >> "$STAT_FILE"
    for EASY in */lib/python*/site-packages/easy-install.pth ; do
        # "wc -l < file" prints only the count, not the file name again
        echo "$EASY $(wc -l < "$EASY")" >> "$STAT_FILE"
    done

Quite possibly analysis would most easily be done by putting it into a
database, but probably only after the fact.

Ian
-
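
One way the "database after the fact" step could look, as a sketch only: already-parsed records are loaded into an in-memory SQLite table and queried for a trend. The table layout, the load_stats helper, the build identifiers, and the numbers are all assumptions for illustration; only the standard-library sqlite3 module is used.

```python
import sqlite3


def load_stats(conn, records):
    """Insert {build_id: {name: value}} records into a simple stats table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stats (build TEXT, name TEXT, value TEXT)"
    )
    rows = [
        (build, name, value)
        for build, stats in records.items()
        for name, value in stats.items()
    ]
    conn.executemany("INSERT INTO stats VALUES (?, ?, ?)", rows)
    conn.commit()


# Illustrative records, one dict per stat file; values are made up.
records = {
    "stat-2008-04-29": {"easy-install-length": "41"},
    "stat-2008-04-30": {"easy-install-length": "43", "start-time": "1209569667"},
}

conn = sqlite3.connect(":memory:")
load_stats(conn, records)

# How the number of easy-install.pth entries changes from build to build.
trend = conn.execute(
    "SELECT build, CAST(value AS INTEGER) FROM stats "
    "WHERE name = 'easy-install-length' ORDER BY build"
).fetchall()
```

Storing every value as text keeps the loader indifferent to which statistics exist; the query decides how to interpret each one.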
On Wed, Apr 30, 2008 at 02:37:53PM -0500, Ian Bicking wrote:
> I would guess that parsing a thousand files and collating the information
> wouldn't be all that slow.  If the records exist, then we can figure out if
> there's a useful way to interpret them.

One thing that already exists is pickled status results from each
build.  The master keeps these around forever (they're used for the web
UI).  These include each step's start and end times, blamelist, and a
few other interesting things.

For example, here's a little poking around, edited:

    $ cd buildmaster/livable-full
    $ python
    >>> import buildbot
    >>> import pickle
    >>> status = pickle.loads(open('42').read())
    >>> status.blamelist
    [u'pw']
    >>> status.number
    42
    >>> status.isFinished()
    True
    >>> status.steps
    [<buildbot.status.builder.BuildStepStatus instance at 0x2b444cb02e18>, ... ]
    >>> flunc = status.steps[-2]
    >>> flunc.name
    'shell_11'
    >>> flunc.getText()
    ['run flunc tests', 'failed']
    >>> flunc.getTimes()
    (1207959887.425462, 1207960088.139046)
    >>> flunc.getLogs()
    [<buildbot.status.builder.LogFile instance at 0x2b444cb0cbd8>]
    >>> flunc.logs[0].filename
    '42-log-shell_11-stdio'

That last one tells us the name of a file where the text from the
buildslave's step was saved locally by the master.

    >>> fluncout = open(flunc.logs[0].filename).read()
    >>> print fluncout
    ...
    [*] running test: edit_project
    ==> at http://localhost:7200/projects/testhaven/project-home
    ...

So, it'd be pretty straightforward to write little scripts that open
these pickles up, extract useful data, parse the stdio logs, etc.

We could combine your idea of marking lines of interest with the
existing stdio logs on the buildmaster: the build prints the marked
values to stdout and the master saves them along with the rest of the
log.  Then I don't have to write any code to make this work.

Might want to have an ending marker too.  And I'd like markers that
don't visually resemble either shell redirection or a python
prompt/doctest.  How about something like ... oh I dunno,
<STAT></STAT> ?

> Like, an interesting record would be the length of easy-install.pth, which
> is about how many libraries are involved in a package.  But I don't want to
> think about it much right *now* -- in a few months it'd be interesting to
> look at.  If I could just add this to a shell file, that'd be useful:
>
>     echo ">>easy-install-length:" >> "$STAT_FILE"
>     for EASY in */lib/python*/site-packages/easy-install.pth ; do
>         # "wc -l < file" prints only the count, not the file name again
>         echo "$EASY $(wc -l < "$EASY")" >> "$STAT_FILE"
>     done

Okay.  Running arbitrary shell scripts on the slave has so far been
kind of inconvenient, because it's designed around running single
commands; so the slave either has to check out the script to run, or
the master has to use a FileDownload command to send it along.  Both of
those seem a little inconvenient for running a tiny script that I'd
prefer to just keep in the master config.

So I've just added a ToppShellScript class to the master, which allows
me to do things like:

    script = """
    echo $1 | grep $2
    """
    buildfactory.addStep(ToppShellScript(
        sourcetext=script,
        scriptargs='"Oh yay I hope this works" "yay"',
        description=...))

Builds are running now; if I haven't broken anything I'll try your
easy-install-length script tomorrow.

Yep, they passed.

--
Paul Winkler
http://www.openplans.org/people/slinkp/profile
-
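
A sketch of how marked values might be pulled back out of a saved stdio log. The message above only floats <STAT></STAT> as a marker pair and does not say where the statistic's name goes, so the name="..." attribute used here is purely an assumption, as are the extract_stats helper and the sample log text.

```python
import re

# Hypothetical marker syntax: <STAT name="...">value</STAT>.
# DOTALL lets a value span several lines, as du output would.
STAT_BLOCK = re.compile(r'<STAT name="([^"]+)">(.*?)</STAT>', re.DOTALL)


def extract_stats(log_text):
    """Return {name: value} for every marked block found in a log."""
    return {
        name: value.strip() for name, value in STAT_BLOCK.findall(log_text)
    }


# Illustrative log text; a real one would be read from a saved stdio log
# file such as the '42-log-shell_11-stdio' file shown above.
sample_log = """[*] running test: edit_project
<STAT name="easy-install-length">43</STAT>
<STAT name="du-sizes">
1200\t.
800\tsrc
</STAT>
some unrelated output
"""

stats = extract_stats(sample_log)
```

An explicit closing marker means unrelated output after a value cannot be mistaken for part of it, which is the reason given above for wanting one.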
(cleaning out my mailbox)

I've decided that I'm not going to try to hack stats-gathering into
Buildbot unless/until I get some time to investigate the possibility of
switching from Buildbot to Bitten, which is designed for exactly this
purpose.  (Ethan pointed it out to me; he and Jeff and I had a brief
off-list discussion about it a while back and agreed we should look into
it "when we have time.")

From http://bitten.edgewall.org/wiki/WhitePaper :

"The goal of this work is to design and implement a distributed system
for automated builds and continuous integration that allows the central
collection and storage of software metrics generated during the build."

Here's an example: http://bitten.edgewall.org/build/trunk

- PW

--
Paul Winkler
http://www.openplans.org/people/slinkp/profile
-
-