@user202729
Contributor

user202729 commented Aug 30, 2025

Singular keeps failing, and I can't reproduce it locally. I will try to debug it on the CI.

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

@github-actions

github-actions bot commented Aug 30, 2025

Documentation preview for this PR (built with commit 548c26e; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

@user202729
Contributor Author

user202729 commented Aug 30, 2025

I also tried to apt install ncdu to see what takes up so much space. In one of the runs I got:

System.IO.IOException: No space left on device : '/home/runner/actions-runner/cached/_diag/Worker_20250830-073157-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
System.IO.IOException: No space left on device : '/home/runner/actions-runner/cached/_diag/Worker_20250830-073157-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Common.Tracing.Error(Exception exception)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
Unhandled exception. System.IO.IOException: No space left on device : '/home/runner/actions-runner/cached/_diag/Worker_20250830-073157-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at System.Diagnostics.TraceSource.Flush()
   at GitHub.Runner.Common.Tracing.Dispose(Boolean disposing)
   at GitHub.Runner.Common.Tracing.Dispose()
   at GitHub.Runner.Common.TraceManager.Dispose(Boolean disposing)
   at GitHub.Runner.Common.TraceManager.Dispose()
   at GitHub.Runner.Common.HostContext.Dispose(Boolean disposing)
   at GitHub.Runner.Common.HostContext.Dispose()
   at GitHub.Runner.Worker.Program.Main(String[] args)

Something might be malfunctioning, since the disk usage usually looks like this (even with a non-editable install):

Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/root       75085112 59355928  15712800  80% /
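
As a sanity check on the df line above, here is a small Python sketch that recomputes the percentage and converts the block counts to GiB. It assumes GNU df semantics (Use% is used / (used + available), rounded up), which is an assumption and not something stated in this thread.

```python
import math

# Values copied from the df output above (1K-blocks, i.e. KiB).
total_kib = 75085112
used_kib = 59355928
avail_kib = 15712800

KIB_PER_GIB = 1024 * 1024

# Assumption: GNU df reports Use% as used / (used + available), rounded up.
use_percent = math.ceil(100 * used_kib / (used_kib + avail_kib))

total_gib = total_kib / KIB_PER_GIB
avail_gib = avail_kib / KIB_PER_GIB

print(f"total {total_gib:.1f} GiB, available {avail_gib:.1f} GiB, use {use_percent}%")
```

So in a normal run roughly 15 GiB of the root filesystem is still free, which is why running out of space looks like a malfunction.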

The following is an investigation of an editable install (which likely takes less disk space):

   38.8 GiB [##############] /usr                                                                   
.   9.7 GiB [###           ] /home                                                                  
    7.8 GiB [##            ] /opt                                                                   
.   4.0 GiB [#             ] /mnt                                                                   
--- /usr/local/lib ---------------------------------------------------------------------------------
                             /..                                                                    
    9.5 GiB [##############] /android                                                               
  463.8 MiB [              ] /node_modules                                                          
    1.4 MiB [              ] /python3.12                                                            

surely we don't need the Android SDK?

--- /usr/local/.ghcup/ghc --------------------------------------------------------------------------
                             /..                                                                    
    3.5 GiB [##############] /9.12.2                                                                
    2.8 GiB [###########   ] /9.10.2                                                                
--- /usr/share -------------------------------------------------------------------------------------
                             /..                                                                    
    5.6 GiB [##############] /miniconda                                                             
    2.7 GiB [######        ] /swift                                                                 
  495.4 MiB [#             ] /az_12.5.0                                                             
--- /usr/lib ---------------------------------------------------------------------------------------
                             /..                                                                    
    1.1 GiB [##############] /jvm                                                                   
    1.0 GiB [############  ] /x86_64-linux-gnu                                                      
  950.7 MiB [###########   ] /google-cloud-sdk                                                      
  730.6 MiB [########      ] /llvm-18                                                               
  587.3 MiB [#######       ] /llvm-16                                                               
  583.6 MiB [#######       ] /llvm-17                                                               
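
To get a feel for how much space the entries above add up to, here is a rough Python sketch that sums a few of them. The sizes are copied from the ncdu listing; the choice of which directories count as removable is purely an assumption for illustration and has not been verified on the runner.

```python
# Sizes copied from the ncdu listing above. Whether these directories are
# safe to delete on the CI runner is an assumption, not a verified fact.
UNITS = {"MiB": 1 / 1024, "GiB": 1.0}


def to_gib(text: str) -> float:
    """Convert a size like '9.5 GiB' or '463.8 MiB' to GiB."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


candidates = {
    "/usr/local/lib/android": "9.5 GiB",
    "/usr/local/.ghcup/ghc/9.12.2": "3.5 GiB",
    "/usr/local/.ghcup/ghc/9.10.2": "2.8 GiB",
    "/usr/share/swift": "2.7 GiB",
    "/usr/lib/google-cloud-sdk": "950.7 MiB",
}

reclaimable = sum(to_gib(size) for size in candidates.values())
print(f"potentially reclaimable: {reclaimable:.1f} GiB")
```

/usr/share/miniconda is deliberately left out of the sum, since the sage-dev environment lives there.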

@user202729
Contributor Author

Running coredumpctl gives some insight.

Sat 2025-08-30 09:53:29 UTC 26702 1001 118 SIGQUIT present  /usr/share/miniconda/envs/sage-dev/bin/python3.12  49.0M
Sat 2025-08-30 09:58:35 UTC 39068 1001 118 SIGABRT present  /usr/share/miniconda/envs/sage-dev/bin/mwrank     357.0K
Sat 2025-08-30 09:58:35 UTC 39085 1001 118 SIGABRT present  /usr/share/miniconda/envs/sage-dev/bin/mwrank     352.0K
Sat 2025-08-30 10:07:19 UTC 53311 1001 118 SIGSEGV present  /usr/share/miniconda/envs/sage-dev/bin/python3.12  68.3M

I guess the first three are always there. Still, they shouldn't be dumping core.

@user202729
Contributor Author

user202729 commented Aug 30, 2025

gdb backtrace from one of the plural failures:

#0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=11, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
#3  0x00007f17a344527e in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
#4  0x00007f17a2d1df79 in sigdie () from /usr/share/miniconda/envs/sage-dev/lib/python3.12/site-packages/cysignals/signals.cpython-312-x86_64-linux-gnu.so
#5  0x00007f17a2d208f7 in cysigs_signal_handler () from /usr/share/miniconda/envs/sage-dev/lib/python3.12/site-packages/cysignals/signals.cpython-312-x86_64-linux-gnu.so
#6  <signal handler called>
#7  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:76
#8  0x000055f4caaedd2d in PyBytes_FromString (str=0x0) at /usr/local/src/conda/python-3.12.11/Objects/bytesobject.c:151
#9  0x00007f174a1e2a31 in ?? ()
#10 0x000055f4cadc0320 in ?? ()
#11 0x000055f4caebc910 in _PyRuntime ()
#12 0x00007f173e554820 in ?? ()
#13 0x2be97660cd1ad400 in ?? ()
#14 0x00007f17a32fd0a0 in ?? ()
#15 0x00007f174a268280 in ?? ()
#16 0x000055f4caebc910 in _PyRuntime ()
#17 0x0000000000000000 in ?? ()

why is str=0x0???

edit: maybe that's unrelated.

edit: it's indeed unrelated; there's a test in decorate.py that makes a subprocess segfault.
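
For context on why such a test shows up in the coredumpctl list, here is a minimal Python sketch of a subprocess that is killed by SIGSEGV on purpose. The child code is a hypothetical stand-in and is not taken from decorate.py.

```python
import signal
import subprocess
import sys

# Child code that deliberately sends SIGSEGV to itself (a hypothetical
# stand-in for whatever the real test does).
child_code = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

result = subprocess.run([sys.executable, "-c", child_code])

# On POSIX, a negative return code -N means the child was killed by signal N.
killed_by_segv = result.returncode == -signal.SIGSEGV
print("child killed by SIGSEGV:", killed_by_segv)
```

Whether the kernel also records a core dump for such a child depends on the system configuration (core pattern and resource limits), so an entry like this can appear even when nothing is actually broken.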

@user202729 user202729 mentioned this pull request Aug 30, 2025
5 tasks
@user202729 user202729 force-pushed the singular-debug branch 2 times, most recently from 898af7f to 7d8078d Compare August 30, 2025 13:49
@user202729
Contributor Author

A traceback

#4  0x00007f1c35b5af79 in sigdie ()
   from /usr/share/miniconda/envs/sage-dev/lib/python3.12/site-packages/cysignals/signals.cpython-312-x86_64-linux-gnu.so
#5  0x00007f1c35b5d8f7 in cysigs_signal_handler ()
   from /usr/share/miniconda/envs/sage-dev/lib/python3.12/site-packages/cysignals/signals.cpython-312-x86_64-linux-gnu.so
#6  <signal handler called>
#7  musable (mem=0x6c2e626765657266) at ./malloc/malloc.c:5238
#8  __malloc_usable_size (m=0x6c2e626765657266) at ./malloc/malloc.c:5252
#9  0x00007f1bd885697a in sattr::kill(ip_sring*) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so
#10 0x00007f1bd8856a68 in sattr::killAll(ip_sring*) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so
#11 0x00007f1bd88c6894 in killhdl2(idrec*, idrec**, ip_sring*) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so
#12 0x00007f1bd88d23da in killlocals_rec(idrec**, int, ip_sring*) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so
#13 0x00007f1bd88d2419 in killlocals_rec(idrec**, int, ip_sring*) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so
#14 0x00007f1bd88d9243 in killlocals(int) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so
#15 0x00007f1bd88c9f81 in iiPStart(idrec*, sleftv*) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so
#16 0x00007f1bd88ca3d2 in iiMake_proc(idrec*, sip_package*, sleftv*) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so
#17 0x00007f1bd88cbf34 in iiLoadLIB(_IO_FILE*, char const*, char const*, idrec*, int, int) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so
#18 0x00007f1bd88cc0b5 in iiLibCmd(char const*, int, int, int) ()
   from /usr/share/miniconda/envs/sage-dev/lib/libSingular-4.4.1.so

#19 0x00007f1bd81d8b98 in __pyx_pw_4sage_4libs_8singular_8function_11lib(_object*, _object[238/1562]
long, _object*) ()
   from /usr/share/miniconda/envs/sage-dev/lib/python3.12/site-packages/sage/libs/singular/function.cpython-312-x86_64-linux-gnu.so
#20 0x00007f1bcdd8af8b in __pyx_pymod_exec_free_algebra_element_letterplace(_object*) ()
   from /usr/share/miniconda/envs/sage-dev/lib/python3.12/site-packages/sage/algebras/letterplace/free_algebra_element_letterplace.cpython-312-x86_64-linux-gnu.so
#21 0x0000556ece6df6c0 in PyModule_ExecDef (module=0x7f1bcdea4360, def=<optimized out>)
    at /usr/local/src/conda/python-3.12.11/Objects/moduleobject.c:440
#22 0x0000556ece6e4159 in _imp_exec_builtin_impl (mod=<optimized out>, module=<optimized out>)
    at /usr/local/src/conda/python-3.12.11/Python/import.c:3832
#23 _imp_exec_builtin (module=<optimized out>, mod=<optimized out>)
    at /usr/local/src/conda/python-3.12.11/Python/clinic/import.c.h:564
#24 0x0000556ece642518 in cfunction_vectorcall_O (func=0x7f1c3619dc60, args=0x7f1bcdea0b98, 
    nargsf=<optimized out>, kwnames=<optimized out>)
    at /usr/local/src/conda/python-3.12.11/Include/cpython/methodobject.h:50
#25 0x0000556ece53a6f0 in PyCFunction_Call (kwargs=0x7f1bcde98bc0, args=0x7f1bcdea0b80, 
    callable=0x7f1c3619dc60) at /usr/local/src/conda/python-3.12.11/Objects/call.c:387
#26 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=0x7f1c36429660, 
    throwflag=<optimized out>) at Python/bytecodes.c:3263
#27 0x0000556ece6428ee in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=2, args=0x7fffe36b4c10, 
    callable=0x7f1c361a4040, tstate=0x556eceaaf910 <_PyRuntime+458992>)
    at /usr/local/src/conda/python-3.12.11/Include/internal/pycore_call.h:92
#28 object_vacall (tstate=tstate@entry=0x556eceaaf910 <_PyRuntime+458992>, base=<optimized out>, 
    callable=0x7f1c361a4040, vargs=0x7fffe36b4ca0)
    at /usr/local/src/conda/python-3.12.11/Objects/call.c:850
#29 0x0000556ece66ba80 in PyObject_CallMethodObjArgs (obj=<optimized out>, 
    name=name@entry=0x556ecea481b8 <_PyRuntime+35224>)
    at /usr/local/src/conda/python-3.12.11/Objects/call.c:911
#30 0x0000556ece66aa5d in import_find_and_load (abs_name=0x7f1bd035a950, 
    tstate=0x556eceaaf910 <_PyRuntime+458992>)
    at /usr/local/src/conda/python-3.12.11/Python/import.c:2793

... <probably unimportant>

    at /usr/local/src/conda/python-3.12.11/Python/pythonrun.c:1757
#106 0x0000556ece70c585 in run_mod (mod=mod@entry=0x556ef786b928, 
    filename=filename@entry=0x7f1c35d38390, globals=globals@entry=0x7f1c361fea00, 
    locals=locals@entry=0x7f1c361fea00, flags=flags@entry=0x7fffe36b71b0, 
    arena=arena@entry=0x7f1c3611bc70)
    at /usr/local/src/conda/python-3.12.11/Python/pythonrun.c:1778
#107 0x0000556ece709620 in pyrun_file (fp=fp@entry=0x556ef77cd490, 
    filename=filename@entry=0x7f1c35d38390, start=start@entry=257, 
    globals=globals@entry=0x7f1c361fea00, locals=locals@entry=0x7f1c361fea00, 
    closeit=closeit@entry=1, flags=0x7fffe36b71b0)
    at /usr/local/src/conda/python-3.12.11/Python/pythonrun.c:1674
#108 0x0000556ece7092be in _PyRun_SimpleFileObject (fp=0x556ef77cd490, filename=0x7f1c35d38390, 
    closeit=1, flags=0x7fffe36b71b0) at /usr/local/src/conda/python-3.12.11/Python/pythonrun.c:459
#109 0x0000556ece708fe4 in _PyRun_AnyFileObject (fp=0x556ef77cd490, 
    filename=filename@entry=0x7f1c35d38390, closeit=closeit@entry=1, 
    flags=flags@entry=0x7fffe36b71b0) at /usr/local/src/conda/python-3.12.11/Python/pythonrun.c:78
#110 0x0000556ece705eb2 in pymain_run_file_obj (skip_source_first_line=0, filename=0x7f1c35d38390, 
    program_name=0x7f1c361f9950) at /usr/local/src/conda/python-3.12.11/Modules/main.c:361
#111 pymain_run_file (config=0x556ecea524f0 <_PyRuntime+77008>)
    at /usr/local/src/conda/python-3.12.11/Modules/main.c:380
#112 pymain_run_python (exitcode=0x7fffe36b7184)
    at /usr/local/src/conda/python-3.12.11/Modules/main.c:634
#113 Py_RunMain () at /usr/local/src/conda/python-3.12.11/Modules/main.c:714
#114 0x0000556ece6c1247 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>)
    at /usr/local/src/conda/python-3.12.11/Modules/main.c:768
#115 0x00007f1c3622a1ca in __libc_start_call_main (main=main@entry=0x556ece6c1190 <main>, 
    argc=argc@entry=6, argv=argv@entry=0x7fffe36b7418) at ../sysdeps/nptl/libc_start_call_main.h:58
#116 0x00007f1c3622a28b in __libc_start_main_impl (main=0x556ece6c1190 <main>, argc=6, 
    argv=0x7fffe36b7418, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, 
    stack_end=0x7fffe36b7408) at ../csu/libc-start.c:360
#117 0x0000556ece6c10ed in _start ()

not all that surprising.

I looked into this problem a bit in #39628, but I haven't figured out the cause yet.
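
One detail in frames #7 and #8 may be worth a closer look: the pointer passed to __malloc_usable_size consists entirely of printable ASCII bytes. Here is a minimal Python sketch that decodes it, assuming the usual little-endian byte order of x86_64. The interpretation below is a guess, not a diagnosis.

```python
# Pointer value copied from frames #7 and #8 of the traceback above.
bad_pointer = 0x6C2E626765657266

# x86_64 is little-endian, so convert to the 8 bytes in memory order.
raw = bad_pointer.to_bytes(8, "little")
print(raw)  # b'freegb.l'

# A real heap pointer would almost never consist only of printable ASCII.
is_printable_ascii = all(0x20 <= b < 0x7F for b in raw)
print("looks like text:", is_printable_ascii)
```

The bytes read freegb.l, which resembles the start of a library file name such as freegb.lib (an assumption). That would be consistent with string data ending up where a pointer was expected while a library is being loaded in frames #17 to #19, but the trace alone does not prove it.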

@user202729 user202729 closed this Sep 3, 2025
vbraun pushed a commit to vbraun/sage that referenced this pull request Sep 24, 2025
sagemathgh-40814: Rerun plural and singular/function on failure
    
This pull request:

* add new feature `--all-except` to `sage -t` (does what you expect)
* modify `ci-meson.yml` to workaround
sagemath#29528 , the cause of which is
yet unknown. (I also tried porting an old pull request that purportedly
fix the issue at sagemath#39628, but the
result is even worse.)

controlling this in bash seems easier than
sagemath#39539 , for now. I suspect testing
these files separately will make it stop failing however (doesn't really
matter, the bug remains).

sagemath#40729 (comment)
contains a traceback, but I think it isn't of too much help.

(Thought? Is `--all --exclude=a --exclude=b` better?)

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [ ] The title is concise and informative.
- [ ] The description explains in detail what this PR is about.
- [ ] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#40814
Reported by: user202729
Reviewer(s): Tobias Diez
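
The commit message above describes rerunning the flaky test files on failure; the actual workaround lives in bash inside ci-meson.yml and is not shown in this thread. Purely as an illustration of the idea, here is a Python sketch. The function name run_with_retry and the stand-in commands are hypothetical and do not reflect the real workflow.

```python
import subprocess
import sys


def run_with_retry(cmd: list[str], attempts: int = 2) -> int:
    """Run cmd up to `attempts` times; return the last exit code (0 on success)."""
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            break
        print(f"attempt {attempt}/{attempts} failed with exit code {returncode}",
              file=sys.stderr)
    return returncode


# Stand-in commands; in CI these would be the test invocations for the flaky files.
always_ok = [sys.executable, "-c", "raise SystemExit(0)"]
always_bad = [sys.executable, "-c", "raise SystemExit(3)"]

print(run_with_retry(always_ok))
print(run_with_retry(always_bad, attempts=2))
```

As the commit message itself notes, a retry only hides the intermittent crash; the underlying bug remains.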
vbraun pushed a commit to vbraun/sage that referenced this pull request Sep 27, 2025
@user202729 user202729 changed the title from "Debug with tmate" to "Debug Singular segmentation fault with tmate" Oct 8, 2025