test: spectralnorm takes 20+ minutes on linux-arm-arm5 #12688

Closed
bradfitz opened this issue Sep 19, 2015 · 17 comments
@bradfitz
Contributor

I notice that the shootout test "spectralnorm" runs for over 20 minutes on linux-arm-arm5.

I logged in to one of the machines and indeed, it's working hard, spinning away.

Some looking around...

 1740 ?        Ss     0:04 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 104:108
30272 ?        Ssl    0:01 /usr/local/bin/buildlet-stage0
30278 ?        Sl     0:30  \_ ./buildlet.exe --workdir=/workdir --hostname=scaleway-prod-09 --halt=false --reverse=linux-arm,linux-arm-arm5 --coordinato
22948 ?        Sl     0:00      \_ /workdir/go/bin/go tool dist test --no-rebuild --banner=XXXBANNERXXX: shootout:spectralnorm
22953 ?        Sl     0:00          \_ /workdir/go/pkg/tool/linux_arm/dist test --no-rebuild --banner=XXXBANNERXXX: shootout:spectralnorm
22964 ?        S      0:00              \_ bash ./timing.sh -test spectralnorm
23008 ?        Rl    11:02                  \_ a.out

root@buildlet-prep:~# cat /proc/23008/environ
GOHOSTARCH=armGO_TEST_TIMEOUT_SCALE=5TERM=dumbGOROOT_BOOTSTRAP=/usr/local/goWORKDIR=/workdirGOTOOLDIR=/workdir/go/pkg/tool/linux_armGOGCCFLAGS=-fPIC -marm -pthread -fmessage-length=0GO_BUILDER_NAME=linux-arm-arm5PATH=.:/workdir/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binGOARM=5PWD=/workdir/go/test/bench/shootoutGOARCH=armLANG=CGO386=387CGO_ENABLED=1CXX=g++SHLVL=1LANGUAGE=en_US.UTF8GOROOT=/workdir/goGOOS=linuxGOHOSTOS=linuxCC=gcc_=./a.out

root@buildlet-prep:~# cat /proc/22964/environ
PATH=/workdir/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binGOROOT_BOOTSTRAP=/usr/local/goWORKDIR=/workdirGO_BUILDER_NAME=linux-arm-arm5GOARM=5GO_TEST_TIMEOUT_SCALE=5GOROOT=/workdir/goGOARCH=armGOHOSTARCH=armGOHOSTOS=linuxGOOS=linuxGOTOOLDIR=/workdir/go/pkg/tool/linux_armTERM=dumbCC=gccGOGCCFLAGS=-fPIC -marm -pthread -fmessage-length=0CXX=g++CGO_ENABLED=1GO386=387LANG=CLANGUAGE=en_US.UTF8root@buildlet-prep:~# 

root@buildlet-prep:~# stat /proc/22964; date
  File: '/proc/22964'
  Size: 0           Blocks: 0          IO Block: 1024   directory
Device: 3h/3d   Inode: 496751      Links: 7
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-09-19 02:28:48.417888821 +0000
Modify: 2015-09-19 02:28:48.417888821 +0000
Change: 2015-09-19 02:28:48.417888821 +0000
 Birth: -
Sat Sep 19 02:29:38 UTC 2015
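
The two environ dumps above look run together because entries in /proc/<pid>/environ are separated by NUL bytes, which the terminal does not display. A minimal sketch of splitting such a dump into readable KEY=VALUE pairs follows; `parseEnviron` is a hypothetical helper written for illustration, and the sample bytes are a subset of the values shown above rather than a real read of /proc.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseEnviron splits the raw contents of /proc/<pid>/environ into a map.
// Each KEY=VALUE entry in that file is terminated by a NUL byte.
// (Hypothetical helper, not part of the Go build infrastructure.)
func parseEnviron(raw []byte) map[string]string {
	env := make(map[string]string)
	for _, entry := range strings.Split(string(raw), "\x00") {
		if entry == "" {
			continue // the trailing NUL produces an empty final element
		}
		kv := strings.SplitN(entry, "=", 2)
		if len(kv) != 2 {
			continue // skip anything that is not KEY=VALUE
		}
		env[kv[0]] = kv[1]
	}
	return env
}

func main() {
	// Sample data modeled on the dump above. On a live system the bytes
	// would come from os.ReadFile("/proc/<pid>/environ").
	raw := []byte("GOARCH=arm\x00GOARM=5\x00GO_TEST_TIMEOUT_SCALE=5\x00" +
		"GOGCCFLAGS=-fPIC -marm -pthread -fmessage-length=0\x00")

	env := parseEnviron(raw)
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, env[k])
	}
}
```

Splitting on NUL rather than on whitespace matters here because values such as GOGCCFLAGS contain spaces.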

@davecheney, can you investigate?

Should I disable these tests on that builder?

/cc @minux @adg

@bradfitz
Contributor Author

Oh, and another point: 20 minutes is the new max timeout that the coordinator permits for any particular dist test.

It can be configured per (builder, test) tuple, though, if we need to.

But please let me know whether this is expected first.

@davecheney
Contributor

This is because spectralnorm uses lots of floating-point ops.

I'll try to disable it in ./timing.sh

@minux
Member

minux commented Sep 19, 2015 via email

@bradfitz
Contributor Author

Why did this only recently become a problem? The timeout used to be only 10 minutes. I raised it to 20 minutes and sharded the shootout tests and only then started seeing it. Maybe I wasn't looking before?

Or was the GOARM detection just broken before and we weren't making GOARM=5 binaries previously?

@davecheney
Contributor

Likely the latter.

@davecheney
Contributor

Or perhaps more correctly, overriding GOARM detection was broken until
recently.

@minux
Member

minux commented Sep 19, 2015 via email

@bradfitz
Contributor Author

I think before we always used GOARM=7 (probably another reason is we used to
not run the shootout test or didn't run on the GOARM=5 builder.)

No, we always did. It wasn't special until just the other day when I experimentally disabled it.

@minux
Member

minux commented Sep 19, 2015 via email

@bradfitz
Contributor Author

A bisection would be great. It's probably more likely a GOARM detection bug, but knowing that too would be interesting.

@davecheney
Contributor

I don't think there is a performance regression. Previously the trybots
always ran that test with GOARM=7 (it used go tool dist env, not go env),
even on the linux-arm-arm5 builder. Now this test is run with GOARM=5 on
linux-arm-arm5.

@bradfitz
Contributor Author

(Nit: linux-arm-arm5 was never a trybot)

Which CL changed this test to use GOARM=5 by using a different go {tool dist,} env whatever?

@davecheney
Contributor

3f2baa3

Before that commit, go tool dist env returned GOARM=7; after it, go tool dist env
returned GOARM=5 on all arm platforms. After 1fd78e1, go tool dist env returns
GOARM=5 where GOARM=5 was passed to the bootstrap stage.

bradfitz added a commit that referenced this issue Sep 19, 2015
Temporary fix to get the arm5 builder happy again.

Without hardware floating point, this test takes over 20 minutes to
run.

A proper solution would probably be to run all the benchmark tests,
but with a much lower iteration count, just to exercise the code.

Updates #12688

Change-Id: Ie56c93d3bf2a5a693a33217ba1b1df3c6c856442
Reviewed-on: https://go-review.googlesource.com/14775
Reviewed-by: Dave Cheney
Reviewed-by: Minux Ma
rsc added this to the Go1.6 milestone Oct 23, 2015
@minux
Member

minux commented Nov 14, 2015 via email

@davecheney
Contributor

SGTM. I'm sorry, I don't remember the specifics of how we did this for arm5.

On 14 Nov 2015, at 12:35, Minux Ma wrote:

This test takes more than 33 minutes on the FPU-less mips64 builder,
could we skip it there too?

@gopherbot
Contributor

CL https://golang.org/cl/16922 mentions this issue.

minux added a commit that referenced this issue Nov 14, 2015
It is too slow with kernel FPU emulator.

Updates #12688.

Change-Id: Ib3a5adfeb46e894550231b14eb0f4fb20aecee11
Reviewed-on: https://go-review.googlesource.com/16922
Reviewed-by: Brad Fitzpatrick
@rsc
Contributor

rsc commented Nov 24, 2015

The tests are disabled now. That sounds fixed to me. (Software floating point is slow.)

rsc closed this as completed Nov 24, 2015
rsc added a commit that referenced this issue Jan 6, 2016
We don't use these for benchmarking anymore.
Now we have the go1 dir and the benchmarks subrepo.
Some have problematic copyright notices, so move out of main repo.

Preserved in golang.org/x/exp/shootout.

Fixes #12688.
Fixes #13584.

Change-Id: Ic0b71191ca1a286d33d7813aca94bab1617a1c82
Reviewed-on: https://go-review.googlesource.com/18320
Reviewed-by: Ian Lance Taylor
golang locked and limited conversation to collaborators Nov 27, 2016