• OpenCore Dev

  • buildbot stats

    from ianb on Apr 30, 2008 11:34 AM
    Is it possible to accumulate statistics from the buildbot runs? 
    Specifically I thought it would be interesting to keep track of the size 
    of our build, the time it takes to build, and the time to run the flunc 
    tests.  It wouldn't be particularly scientific, but it'd be interesting 
    I think.  We'd just have to write to some file outside of the build, I 
    guess?  Maybe one file per build, structured in some way to make it easy 
    to extract and easy to add new statistics, and probably written so you 
    can just append to the file to track something new.
    
    Maybe each thing would start with '>>variable_name:' (something 
    relatively unique so that we can have multi-line values).  Then we could do:
    
    STAT_FILE="/persistent-dir/stats/stat-$(date '+%Y-%m-%d-%s').txt"
    record_stat () {
         # Write ">>name: value" followed by a blank line.
         printf '>>%s: ' "$1" >> "$STAT_FILE"
         shift
         echo "$@" >> "$STAT_FILE"
         echo >> "$STAT_FILE"
    }
    record_stat 'start-time' "$(date '+%s')"
    pushd "$BUILD_DIR"
    # Quote the substitution so du's one-line-per-entry output is kept.
    record_stat 'du-sizes' "$(du -s . *)"
    popd
    ...
    record_stat 'end-time' "$(date '+%s')"
    
    And we'd have a record.  Or maybe buildbot even has a convention for 
    this already?
    
       Ian
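
    A minimal sketch of how a file in the '>>variable_name:' layout
    described above could be read back, assuming a record runs from its
    marker line up to the next marker.  The function name, the set of
    characters allowed in a variable name, and the sample data are all
    made up for illustration.

```python
import re

# A line that opens a new record: ">>name:" followed by optional text.
# The allowed name characters are an assumption, not part of the proposal.
MARKER = re.compile(r"^>>([A-Za-z0-9_-]+):[ \t]?(.*)$")


def parse_stats(text):
    """Return {name: value} for every '>>name:' record in text."""
    stats = {}
    name = None
    lines = []
    for line in text.splitlines():
        match = MARKER.match(line)
        if match:
            # A new marker closes the previous record, if any.
            if name is not None:
                stats[name] = "\n".join(lines).strip()
            name = match.group(1)
            lines = [match.group(2)]
        elif name is not None:
            # Continuation line of a multi-line value.
            lines.append(line)
    if name is not None:
        stats[name] = "\n".join(lines).strip()
    return stats


# Made-up sample in the proposed layout.
sample = """>>start-time: 1209569667

>>du-sizes:
1200 .
800 src

>>end-time: 1209569900
"""

stats = parse_stats(sample)
print(stats["start-time"])
print(stats["du-sizes"].splitlines())
```

    If the same name appears twice in one file, this sketch keeps only
    the last value; collecting a list per name would be an easy change.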
    
  • Re: buildbot stats

    from slinkp on Apr 30, 2008 02:58 PM
    On Wed, Apr 30, 2008 at 10:34:27AM -0500, Ian Bicking wrote:
    > Is it possible to accumulate statistics from the buildbot runs? 
    
    Funny, just this morning I was thinking about accumulating some kind
    of history so you could track success/failure rate trends.
    
    > Specifically I thought it would be interesting to keep track of the size of 
    > our build, the time it takes to build, and the time to run the flunc
    > tests. 
    
    Start and stop times are already tracked by the buildmaster; the
    builder's steps each have a getTimes() method that returns a pair of
    timestamps.
    
    I'm not sure how useful times would be, since they vary quite a lot
    with system load.  Probably the late-night scheduled builds would be
    most informative here.
    
    >  It wouldn't be particularly scientific, but it'd be interesting I think.  
    > We'd just have to write to some file outside of the build, I guess?  Maybe 
    > one file per build, structured in some way to make it easy to extract and 
    > easy to add new statistics, and probably written so you can just append to 
    > the file to track something new.
    >
    > Maybe each thing would start with '>>variable_name:' (something relatively 
    > unique so that we can have multi-line values).  Then we could do:
    >
    > STAT_FILE="/persistent-dir/stats/stat-$(date '+%Y-%m-%d-%s.txt')"
    > record_stat () {
    >     echo -n ">>$1: " >> $STAT_FILE
    >     shift
    >     echo "$@" >> $STAT_FILE
    >     echo >> $STAT_FILE
    > }
    > record_stat 'start-time' $(date '+%s')
    > pushd $BUILD_DIR
    > record_stat 'du-sizes' $(du -s . *)
    > popd
    > ...
    > record_stat 'end-time' $(date '+%s')
    >
    > And we'd have a record.  Or maybe buildbot even has a convention for this 
    > already?
    
    Doesn't seem to have one.  However, it might not be hard to hack a
    builder shell command that includes arbitrary output like from du so
    we can at least track that one in the web display. I'll look into
    this.
    
    On a related note, there's an open buildbot ticket about putting
    status information into a sql database, which might make various kinds
    of ad-hoc queries easier. But it doesn't seem targeted at storing
    arbitrary data like this.
    http://buildbot.net/trac/ticket/24
    
    On another related note, I've been thinking about writing a little
    buildbot meta-master web app that would aggregate information from
    multiple buildmasters.
    
    My reason for wanting this is that buildbot is really designed with
    the idea that a single buildmaster is dedicated to one "project" that
    lives in a single source repository, but A) some things we want to
    build ("openplans") are in multiple parts of our svn tree, and B) I
    really want a single page that summarizes builds of multiple projects.
    Buildbot can't really do that with a single buildmaster.  I talked to
    Brian Warner about this briefly at Pycon, who confirmed that it would
    be pretty hard to change Buildbot to support multiple projects,
    because the single-project assumption is all over the place; e.g. the
    config file is assumed to contain a single BuildmasterConfig
    dictionary which tracks only a single change source. (I've already got
    some ugly change-filtering going on to enable builds to be triggered
    by changes in multiple parts of our repository, and there are things I
    just can't do with this approach, like have a change to sputnik
    trigger a livablestreets build but NOT a vanilla opencore build.)
    
    Brian confirmed that the thing to do is run multiple buildmasters. But
    I still want that single summary view, so I think that'll have to be a
    new app. Either a new twisted app that runs multiple masters
    in-process, or a separate process that talks to buildmasters via
    xmlrpc or http.
    
    -- 
    
    Paul Winkler
    http://www.openplans.org/people/slinkp/profile
    yahoo: slinkp23
    AIM:   slinkp1970
    
    • Re: buildbot stats

      from ianb on Apr 30, 2008 03:37 PM
      Paul Winkler wrote:
      > On Wed, Apr 30, 2008 at 10:34:27AM -0500, Ian Bicking wrote:
      >> Is it possible to accumulate statistics from the buildbot runs? 
      > 
      > Funny, just this morning I was thinking about accumulating some kind
      > of history so you could track success/failure rate trends.
      > 
      >> Specifically I thought it would be interesting to keep track of the size of 
      >> our build, the time it takes to build, and the time to run the flunc
      >> tests. 
      > 
      > Start and stop times are already tracked by the buildmaster; the
      > builder's steps each have a getTimes() method that returns a pair of
      > timestamps.
      > 
      > I'm not sure how useful times would be, since they vary quite a lot
      > with system load.  Probably the late-night scheduled builds would be
      > most informative here.
      
      Yes, being able to filter out builds based on time of day might be 
      useful.  There will also be a certain amount of noise in anything 
      time-related, but we could still see trends as a result.
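
      As a rough sketch of that filtering, assuming each build step
      gives a (start, end) pair of Unix timestamps like the one
      getTimes() is said to return: the helper names, the 00:00-05:59
      UTC window, and the sample timestamps below are all made up.

```python
from datetime import datetime, timezone


def duration(times):
    """Seconds between a (start, end) pair of Unix timestamps."""
    start, end = times
    return end - start


def is_night_build(times, first_hour=0, last_hour=5):
    """True if the step started between first_hour and last_hour (UTC).

    The window and the use of UTC are assumptions; a real script would
    probably want the build machine's local time zone.
    """
    start, _ = times
    hour = datetime.fromtimestamp(start, tz=timezone.utc).hour
    return first_hour <= hour <= last_hour


# Made-up (start, end) pairs: one started at 01:00 UTC, one at 11:00 UTC.
builds = [
    (1207962000.0, 1207962200.5),
    (1207998000.0, 1207998300.0),
]

night_durations = [duration(t) for t in builds if is_night_build(t)]
print(night_durations)
```

      Plotting only the night-time durations over a few weeks would be
      one way to see a trend through the load-related noise.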
      
      >>  It wouldn't be particularly scientific, but it'd be interesting I think.  
      >> We'd just have to write to some file outside of the build, I guess?  Maybe 
      >> one file per build, structured in some way to make it easy to extract and 
      >> easy to add new statistics, and probably written so you can just append to 
      >> the file to track something new.
      >>
      >> Maybe each thing would start with '>>variable_name:' (something relatively 
      >> unique so that we can have multi-line values).  Then we could do:
      >>
      >> STAT_FILE="/persistent-dir/stats/stat-$(date '+%Y-%m-%d-%s.txt')"
      >> record_stat () {
      >>     echo -n ">>$1: " >> $STAT_FILE
      >>     shift
      >>     echo "$@" >> $STAT_FILE
      >>     echo >> $STAT_FILE
      >> }
      >> record_stat 'start-time' $(date '+%s')
      >> pushd $BUILD_DIR
      >> record_stat 'du-sizes' $(du -s . *)
      >> popd
      >> ...
      >> record_stat 'end-time' $(date '+%s')
      >>
      >> And we'd have a record.  Or maybe buildbot even has a convention for this 
      >> already?
      > 
      > Doesn't seem to have one.  However, it might not be hard to hack a
      > builder shell command that includes arbitrary output like from du so
      > we can at least track that one in the web display. I'll look into
      > this.
      
      I don't know if it'll be that interesting.  It might be interesting.  If 
      we have a basic sort of structure, then saving the information won't be 
      very hard, and we can determine how interesting it is later, and ignore 
      it if it isn't interesting.
      
      > On a related note, there's an open buildbot ticket about putting
      > status information into a sql database, which might make various kinds
      > of ad-hoc queries easier. But it doesn't seem targeted at storing
      > arbitrary data like this.
      > http://buildbot.net/trac/ticket/24
      
      I would guess that parsing a thousand files and collating the 
      information wouldn't be all that slow.  If the records exist, then we 
      can figure out if there's a useful way to interpret them.
      
      Like, an interesting record would be the length of easy-install.pth, 
      which is about how many libraries are involved in a package.  But I 
      don't want to think about it much right *now* -- in a few months it'd be 
      interesting to look at.  If I could just add this to a shell file, 
      that'd be useful:
      
         echo ">>easy-install-length:" >> "$STAT_FILE"
         for EASY in */lib/python*/site-packages/easy-install.pth ; do
             # wc -l < file prints only the count, so the path appears once
             echo "$EASY $(wc -l < "$EASY")" >> "$STAT_FILE"
         done
      
      Quite possibly analysis would most easily be done by putting it into a 
      database, but probably only after the fact.
      
         Ian
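
      A small sketch of the "database after the fact" idea, assuming
      each build's records have already been read into a dict of
      name -> value strings.  The table layout, the function name, and
      the build ids and values are invented for illustration; sqlite3
      from the Python standard library stands in for whatever database
      would really be used.

```python
import sqlite3


def store_stats(conn, build_id, stats):
    """Insert one build's {name: value} records into a 'stats' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stats ("
        "build_id TEXT, name TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO stats (build_id, name, value) VALUES (?, ?, ?)",
        [(build_id, name, value) for name, value in stats.items()],
    )
    conn.commit()


# In-memory database and made-up records, just to show the shape.
conn = sqlite3.connect(":memory:")
store_stats(conn, "build-1", {"start-time": "100", "end-time": "160"})
store_stats(conn, "build-2", {"start-time": "200", "end-time": "290"})

# Ad-hoc query: elapsed seconds per build, from the two time records.
rows = conn.execute(
    "SELECT s.build_id, "
    "CAST(e.value AS INTEGER) - CAST(s.value AS INTEGER) "
    "FROM stats s JOIN stats e ON s.build_id = e.build_id "
    "WHERE s.name = 'start-time' AND e.name = 'end-time' "
    "ORDER BY s.build_id"
).fetchall()
print(rows)
```

      Storing every value as text keeps the loader indifferent to what
      is being tracked; the query decides how to interpret each record.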
      
      • Re: buildbot stats

        from slinkp on Apr 30, 2008 08:43 PM
        On Wed, Apr 30, 2008 at 02:37:53PM -0500, Ian Bicking wrote:
        > I would guess that parsing a thousand files and collating the information 
        > wouldn't be all that slow.  If the records exist, then we can figure out if 
        > there's a useful way to interpret them.
        
        One thing that already exists is pickled status results from each
        build.  The master keeps these around forever (they're used for the
        web UI). These include each step's start and end times, blamelist, and
        a few other interesting things. For example, here's a little poking
        around, edited:
        
        $ cd buildmaster/livable-full
        $ python
        >>> import buildbot
        >>> import pickle
        >>> status = pickle.loads(open('42').read())
        >>> status.blamelist
        [u'pw']
        >>> status.number
        42
        >>> status.isFinished()
        True
        >>> status.steps
        [<buildbot.status.builder.BuildStepStatus instance at 0x2b444cb02e18>, 
        ...
        ]
        >>> flunc = status.steps[-2]
        >>> flunc.name
        'shell_11'
        >>> flunc.getText()
        ['run flunc tests', 'failed']
        >>> flunc.getTimes()
        (1207959887.425462, 1207960088.139046)
        >>> flunc.getLogs()
        [<buildbot.status.builder.LogFile instance at 0x2b444cb0cbd8>]
        >>> flunc.logs[0].filename
        '42-log-shell_11-stdio'
        
        
        That last one tells us the name of a file where the text from the
        buildslave's step was saved locally by the master.
        
        >>> fluncout = open(flunc.logs[0].filename).read()
        >>> print fluncout
        ...
                [*] running test: edit_project
                    ==> at http://localhost:7200/projects/testhaven/project-home
        ...
        
        So, it'd be pretty straightforward to write little scripts that open
        these pickles up, extract useful data, parse the stdio logs, etc.
        
        We could use your idea of marking lines of interest with the existing
        stdio logs on the buildmaster.  Then I don't have to write any code to
        make this work.  Might want to have an ending marker too.  And I'd
        like markers that don't visually resemble either shell redirection or
        a python prompt/doctest.  How bout something like ... oh i dunno,
        <STAT></STAT> ?
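
        A possible sketch of pulling such markers out of a stdio log,
        assuming each record is written as <STAT>name: value</STAT> and
        that the value may span several lines.  Only the <STAT></STAT>
        tags come from the suggestion above; the "name: value" layout
        inside them, the function name, and the sample log text are
        assumptions.

```python
import re

# Assumed layout: <STAT>name: value</STAT>; DOTALL lets a value span lines.
STAT_RE = re.compile(r"<STAT>\s*([\w-]+)\s*:(.*?)</STAT>", re.DOTALL)


def extract_stats(log_text):
    """Return {name: value} for every <STAT>...</STAT> block in the log."""
    return {name: value.strip() for name, value in STAT_RE.findall(log_text)}


# Made-up log text mixing ordinary output with two marked records.
log_text = """
[*] running test: edit_project
<STAT>du-size: 123456</STAT>
some unrelated output
<STAT>easy-install-length:
opencore/lib/python2.4/site-packages/easy-install.pth 42
</STAT>
"""

found = extract_stats(log_text)
print(sorted(found))
print(found["du-size"])
```

        Having an explicit end tag means ordinary output between
        records is simply ignored, which a start-only marker can't
        guarantee.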
        
        > Like, an interesting record would be the length of easy-install.pth, which 
        > is about how many libraries are involved in a package.  But I don't want to 
        > think about it much right *now* -- in a few months it'd be interesting to 
        > look at.  If I could just add this to a shell file, that'd be useful:
        >
        >   echo ">>easy-install-length:" >> $STAT_FILE
        >   for EASY in */lib/python*/site-packages/easy-install.pth ; do
        >       echo $EASY $(wc -l $EASY) >> $STAT_FILE
        >   done
        
        Okay. Running arbitrary shell scripts on the slave has so far been
        kind of inconvenient, because it's designed around running single
        commands; so the slave either has to check out the script to run, or
        the master has to use a FileDownload command to send it along.  Both
        of those seem a little inconvenient for running a tiny script that I'd
        prefer to just keep in the master config.
        
        So I've just added a ToppShellScript class to the master, which allows
        me to do things like:
        
        script = """
        echo $1 | grep $2
        """
        
        buildfactory.addStep(ToppShellScript(
            sourcetext=script,
            scriptargs='"Oh yay I hope this works" "yay"',
            description=...))
        
        
        Builds are running now; if I haven't broken anything, I'll try your
        easy-install-length script tomorrow.  (Update: yep, they passed.)
        
        -- 
        
        Paul Winkler
        http://www.openplans.org/people/slinkp/profile
        yahoo: slinkp23
        AIM:   slinkp1970
        
      • Re: buildbot stats

        from slinkp on Jun 02, 2008 02:58 PM
        (cleaning out my mailbox)
        
        I've decided that I'm not going to try to hack stats-gathering into
        Buildbot unless/until I get some time to investigate the possibility
        of switching from Buildbot to Bitten, which is designed for exactly
        this purpose.  (Ethan pointed it out to me; he and Jeff and I had a
        brief off-list discussion about it a while back and agreed we should
        look into it "when we have time.")
        
        From http://bitten.edgewall.org/wiki/WhitePaper :
        
         """The goal of this work is to design and implement a distributed
         system for automated builds and continuous integration that allows
         the central collection and storage of software metrics generated
         during the build."""
        
        Here's an example: http://bitten.edgewall.org/build/trunk
        
        - PW
        
        
        -- 
        
        Paul Winkler
        http://www.openplans.org/people/slinkp/profile
        yahoo: slinkp23
        AIM:   slinkp1970