Logger safe async io #245

wclodius2 · 2020-11-16T20:45:17Z

Changes in four files:

doc/specs/index.md - added stdlib_logger.html to the index;
doc/specs/stdlib_logger.md - described the changes in stdlib_logger.f90;
src/tests/logger/test-stdlib_logger.f90 - added a blank after a comma in the output;
src/stdlib_logger.f90 - numerous changes
- the code handles new_line calls in the message string properly (added by @jvdp1, but undocumented in stdlib_logger.md)
- a character string buffer was added to logger_type and alen_buffer integer to track the number of active characters in the buffer;
- the code replaced multiple calls to write statements with single calls with line breaks generated by inserting new_line in the output string, in order to make the output thread and asynchronous IO safe;
- shortened a number of lines most of them 80+ characters long
- changed Rrea_only_error to read_only_error

Added a blank to the output afer a comma. [ticket: X]

Added one line to the index referencing stdlib_logger.html. [ticket: X]

Documented changes in stdlib_logger.f90. Among them: 1. Additions of buffer and len_buffer to the logger type 2. Changes of self from intent(in) to intent(inout) in several procedures 3. Allowing new_line call in messsage string [ticket: X]

Numerous changes to src/stdlib_logger.f90: 1. Changed `Rrea_...` to `read_...` 2. Shortened several long lines, mostly those longer than 80 characters 3. Added buffer and len_buffer to logger_type(?) 4. Changed output so that instead of multiple writes of individual strings there is a single write of a single string formatted by inserting new_line calls. [ticket: X]

Changed indexing of the self % buffer from 0 based to the proper 1s based, and eliminated some unused variables. [ticket: X]

[GH Workflow set environment variables](https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#environment-files)

milancurcic · 2020-11-16T22:18:31Z

@14NGiestas if your fix for docs build works, can you submit it as a separate PR? Then we can merge it and rebase this PR to master so that the docs build.

14NGiestas · 2020-11-16T22:21:52Z

@14NGiestas if your fix for docs build works, can you submit it as a separate PR? Then we can merge it and rebase this PR to master so that the docs build.

It is already a separate PR #246. I'm just waiting the CI to finish.

Small fix to issue #245

src/stdlib_logger.f90

Co-authored-by: Ian Giestas Pauli <[email protected]>

doc/specs/stdlib_logger.md

jvdp1

Two small issues discovered while testing the changes... There are still a few issues that I will try to solve.

src/stdlib_logger.f90

jvdp1

Issue: the end of a string with indent=.true. was not printed correctly.

The 3 proposed suggestions should solve this issue properly.

src/stdlib_logger.f90

Changed log_error and log_io_error similarly 1. Added dummy character variable of len 28 and used it to hold stat/iostat string representation or lack thereof. 2. Changed suffix to allocatable len character variable and used it to hold the combined stat/iostat and errmsg/iomsg string representations or lack thereof. 3. Eliminated the trim of suffix as unneeded. 4. Eliminated unit and passed -999 directly to handle_write_failure. 5. For log_io_error changed procedure_name to 'log_io_error'. [ticket: X]

wclodius2 · 2020-11-18T00:12:06Z

I have committed and pushed fixes for the unit = -999 and the potentially too short suffix in both log_error and log_io_error.

jvdp1

The changes sound good to me. Thank you @wclodius2 for these improvements!
IMO, this PR could be merged.

milancurcic

Thank you @wclodius2, I tested the changes locally.

Is it feasible to add a test that confirms that it's indeed async I/O safe? As I understand how this is implemented--write multi-lines into a character string, then write it to output in a single write statement--it should work as advertised. But a test program would help ensure this. What do you think?

Do you expect that this is async I/O safe also for SPMD programs using MPI or coarrays? Specifically, in a pattern where multiple processes are logging to the same file, or stdout. Granted, a safer programming pattern in this case would be for each process to write to its own dedicated logger file. Should we make note of this in the docs?

jvdp1 · 2020-11-22T17:33:51Z

Do you expect that this is async I/O safe also for SPMD programs using MPI or coarrays? Specifically, in a pattern where multiple processes are logging to the same file, or stdout. Granted, a safer programming pattern in this case would be for each process to write to its own dedicated logger file. Should we make note of this in the docs?

Would an OpenMP do loop with a logger do the trick? I have a similar example in one of my codes, however I put the logger in a critical OMP section with the first implementation.

wclodius2 · 2020-11-22T17:52:26Z

While a test would be useful I don't know how to write it. Parallel programming was not in demand for my work. In particular, how do we ensure that asynchronous calls aren't accidentally synchronous? Do we want to special case the cmake compiler flags and Makefile.manual flags for stdlib_logger.f90 and test_stdlib_logger.f90?

A note to the effect of "While every effort has been made to make the code thread and asynchronous I/O safe, it is always best to have each process write to its own dedicated logger file" might be useful.

On a related note, should I add an asynchronous argument to the subroutine add_log_file?

jvdp1 · 2020-11-23T19:28:57Z

While a test would be useful I don't know how to write it. Parallel programming was not in demand for my work. In particular, how do we ensure that asynchronous calls aren't accidentally synchronous? Do we want to special case the cmake compiler flags and Makefile.manual flags for stdlib_logger.f90 and test_stdlib_logger.f90?

Could such a test be satisfying, assuming that most compilers support basic OpenMP directives?

...
type(logger_type) :: logger
character (len = 30) :: cdumy
...

call logger % add_log_file( 'log.txt')
call logger % configure(indent = .true., time_stamp = .true.)
....
!$omp parallel default(none) shared(logger)
!$omp do
do i = 1, 100
  !$ write(cdummy, '(i0)') omp_get_thread_num()
  !$omp atomic
  call logger % log_message( ' Message from thread '//trim(cdummy) )'
enddo
!$omp end do
!$omp end parallel
....

Such a test would be more about thread then asynchronous safety.
To "mimick" MPI, an array of loggers could be used, and each thread would used its own logger.
However, what/how should be tested to have a proper test?

A note to the effect of "While every effort has been made to make the code thread and asynchronous I/O safe, it is always best to have each process write to its own dedicated logger file" might be useful.

IMO it is a good idea to add it in the specs, especially if we claim it should asynchornous/thread/process-safe.

On a related note, should I add an asynchronous argument to the subroutine add_log_file?

It would be good indeed. Probably it is best to do it in another PR independent from this one.

milancurcic · 2020-11-23T21:20:32Z

A note to the effect of "While every effort has been made to make the code thread and asynchronous I/O safe, it is always best to have each process write to its own dedicated logger file" might be useful.

@wclodius2, I agree. Do you mind adding this note in the spec document where you find it most appropriate?

I think this will then be good to merge.

Added the sentence "While every effort has been made to make the code thread and asynchronous I/O safe, it is always best to have each process write to its own dedicated logger file." [ticket: X]

wclodius2 · 2020-11-24T01:15:40Z

I have added the sentence to stdlib_logger.md just before the note at the end of the introduction.

@jvdp1 I have no experience with Open MP. My gut feeling is that an additional test wouldn't hurt, but it should probably be written by you as a separate PR. Things to think about if you should decide to do it.

It appears that compiling with gfortran for open MP requires the compiler flag -fopenmp. Would that flag apply to the library as a whole or just the test code?
Do you envision the code as a separate "application" or part of the current test case?
I find going through a hundred lines of output a bit daunting. Would ten lines be a sufficient test?

14NGiestas · 2020-11-24T06:48:53Z

Does this PR handles access to the buffer from different threads? Because having a buffer string without guarding the critical region does not guarantee one thread will not overwrite the buffer while the other thread was writing. I need to review the code to better understand that, but I belive It's up to the user put the logger call in a guarded region (omp critical, for e.g.).

jvdp1 · 2020-11-24T07:47:24Z

@wclodius2 I will propose another test later, when this is merged.

Does this PR handles access to the buffer from different threads? Because having a buffer string without guarding the critical region does not guarantee one thread will not overwrite the buffer while the other thread was writing. I need to review the code to better understand that, but I belive It's up to the user put the logger call in a guarded region (omp critical, for e.g.).

Good point. No, there is no guarding within stdlib_logger that different threads access to the buffer at the same time (that is why I set it in an atomic section in the example above). We could add orphaned OpenMP directives in stdlib_logger. However, as @14NGiestas , I would think it is up to the user to put the logger call in a guarded region. In that case, a sentence in the specs should be added for multi-thread programs.

Note that this issue should not apppear in MPI/coarray-type programs.

doc/specs/stdlib_logger.md

Added Jeremies revision of the discussion in stdlib_logger.md. [ticket: X]

wclodius2 · 2020-11-24T16:59:31Z

@14NGiestas FWIW the string buffer is in a logger object, and not a module variable. With a little extra work it could be made a procedure variable.

wclodius2 · 2020-11-24T19:03:24Z

FWIW with my latest push five of six CI/Builds passed. The one that failed "CI / Build (macos-latest, 9) (pull_request)" failed in running test_stdlib_bitset_large with a file read error. It passes that test on my system with gfortran 10.2.

jvdp1 · 2020-11-24T21:09:41Z

FWIW with my latest push five of six CI/Builds passed. The one that failed "CI / Build (macos-latest, 9) (pull_request)" failed in running test_stdlib_bitset_large with a file read error. It passes that test on my system with gfortran 10.2.

I tested it on my computer (Fedora) with gfortran 9.3.1, and there were no issues.
I rerun the tests, but they still failed.

wclodius2 · 2020-11-24T23:18:22Z

FWIW the Mac OS run that fails uses gfortran 9.3.0. It fails with an unexpected End-of-File, which I have trouble understanding as I would expect that error to occur only if there were a coding error that affects all runs. A glance at the code in stdlib_bitsets_large.f90 and test_stdlib_bitset_large.f90 doesn't show anything wrong with the pertinent sections of code. Further, CI / Build (ubuntu-latest, 9) also uses gfortran 9.3.0 and passes the tests. Maybe it is a bad disk sector involving the source code. I hate boggles!

jvdp1 · 2020-11-25T07:20:54Z

FWIW the Mac OS run that fails uses gfortran 9.3.0. It fails with an unexpected End-of-File, which I have trouble understanding as I would expect that error to occur only if there were a coding error that affects all runs. A glance at the code in stdlib_bitsets_large.f90 and test_stdlib_bitset_large.f90 doesn't show anything wrong with the pertinent sections of code. Further, CI / Build (ubuntu-latest, 9) also uses gfortran 9.3.0 and passes the tests. Maybe it is a bad disk sector involving the source code. I hate boggles!

I tested it again on my computer with different compilers and debug options. It always passed the tests! So no idea...

jvdp1 · 2020-11-26T16:23:59Z

doc/specs/stdlib_logger.md

+While every effort has been made to make the code process and
+asynchronous I/O safe, it is always best to have each process write to
+its own dedicated logger file.
+For thread parallelism (e.g., with OpenMP), it is advised to put the
+logger call in a guarding region (e.g., in an OpenMP critical region).
+


@milancurcic
Are you satisfied with this note that covers multi-thread and multi-process programs? If yes, I am ok to merge this PR.

Yes, I'm happy with it so I'll go ahead and merge. Thank you all.

wclodius2 added 5 commits November 16, 2020 11:25

Fixed minor formatting of output

716d2f5

Added a blank to the output afer a comma. [ticket: X]

Added logger to the index

1fa3a3e

Added one line to the index referencing stdlib_logger.html. [ticket: X]

Fixed indexing and unused variables

786aaa4

Changed indexing of the self % buffer from 0 based to the proper 1s based, and eliminated some unused variables. [ticket: X]

milancurcic mentioned this pull request Nov 16, 2020

Logger2 - Changes mostly to make output thread and Asynchronous iO safe #244

Closed

milancurcic added a commit that referenced this pull request Nov 16, 2020

Merge pull request #246 from 14NGiestas/patch-1

66272b5

Small fix to issue #245

Merge remote-tracking branch 'upstream/master' into logger-safe-async-io

cbb80a5

14NGiestas reviewed Nov 17, 2020

View reviewed changes

src/stdlib_logger.f90 Outdated Show resolved Hide resolved

Update src/stdlib_logger.f90

0b17754

Co-authored-by: Ian Giestas Pauli <[email protected]>