logging: Fix creating empty file even when disabled #2966

Merged

Conversation

KerfuffleV2
Collaborator

Logging doesn't seem to fully respect being set to disabled: it'll still create the logfile, just not write anything to it. So, for example, if you pass --log-disable to main you'll still end up with a bunch of empty logfiles.

This seems like the obvious fix, and as far as I can see it doesn't break anything with logging enabled or disabled, but my understanding of the log system here is very rudimentary.

Going to mark this as a bug since even if my approach to fixing it is wrong, surely creating a bunch of log files when logging is disabled can't be intended behavior.
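
To make the intent concrete, here is a minimal sketch of the kind of guard the fix amounts to; the names (log_enabled, log_filename, log_handler) are placeholders for illustration, not the actual common/log.h code:

```cpp
#include <cstdio>
#include <string>

// Hypothetical mono-state, mirroring the static-variable style discussed below.
static bool        log_enabled  = true;        // flipped to false by --log-disable
static std::string log_filename = "llama.log";
static FILE *      log_file     = nullptr;

// Return the target stream for log writes, creating the file lazily.
static FILE * log_handler() {
    if (!log_enabled) {
        // The fix in spirit: bail out before ever opening the file,
        // so --log-disable leaves no empty logfile behind.
        return nullptr;
    }
    if (!log_file) {
        log_file = std::fopen(log_filename.c_str(), "w");
    }
    return log_file ? log_file : stderr;
}
```

With the enabled check placed before the fopen, a disabled run never touches the filesystem at all.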

@KerfuffleV2 KerfuffleV2 added the bug Something isn't working label Sep 2, 2023
@KerfuffleV2 KerfuffleV2 requested a review from staviq September 2, 2023 09:23
@KerfuffleV2
Collaborator Author

Follow-up question: I see the logging uses a bunch of static variables. I'm not that familiar with threading in C++, but as far as I can see, initializing static variables is thread-safe while accessing them isn't. However, it doesn't look like the logging code protects access with a mutex or anything like that. Am I missing something, or is it actually safe?

@staviq
Contributor

staviq commented Sep 2, 2023

Follow-up question: I see the logging uses a bunch of static variables. I'm not that familiar with threading in C++, but as far as I can see, initializing static variables is thread-safe while accessing them isn't. However, it doesn't look like the logging code protects access with a mutex or anything like that. Am I missing something, or is it actually safe?

No, you aren't missing anything, it's just that logs were originally requested for the purpose of helping debug main within a non-threaded context.

I mentioned in the original PR comments how I would imagine thread safety for logs, but I didn't implement it there, to avoid falling into scope creep.

I'm absolutely in favour of thread safety if there is a use case for it now.

Edit: Found it, this was my concept: #2748 (comment)

@staviq
Contributor

staviq commented Sep 2, 2023

Actually, that empty logfile issue is somewhat related to logs currently being non-thread-safe and "mono-state", as in, the state is stored in those static variables inside the log handler function body.

I would imagine that expanding logs for threaded use would require splitting them into separate local-context states and passing some local state struct to the log macros, which would likely solve the problem of generating an empty logfile too.

Basically, a logfile is created on the first call to a log handler macro, so if logs get disabled at runtime and a log macro was used prior to that, an empty log file would be created.

I didn't want to mess with existing arg parsing, so I just added mine inside gpt_params_parse, and that sort of forced a certain order of things.

Possibly this could be solved by introducing a log_init() or something to that effect?
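
To illustrate the ordering issue (again with assumed names, not the real common/log.h): the file is opened the first time any log macro fires, which can happen before gpt_params_parse has seen --log-disable, while the write itself is suppressed afterwards, leaving an empty file behind.

```cpp
#include <cstdio>

static bool log_enabled = true;   // flipped by --log-disable during argument parsing

// Mono-state: the handler keeps its state in a function-local static and
// opens the file on the first call, regardless of log_enabled.
static FILE * log_target() {
    static FILE * file = std::fopen("llama.log", "w");
    return file;
}

// The write is suppressed when logging is disabled, but log_target() has
// already created the (now empty) file by the time that check happens.
#define LOG(...)                                  \
    do {                                          \
        FILE * f = log_target();                  \
        if (log_enabled && f) {                   \
            std::fprintf(f, __VA_ARGS__);         \
        }                                         \
    } while (0)
```

A log_init() along those lines would essentially move the fopen out of the macro path and run it explicitly once argument parsing has decided whether logging is enabled.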

I'll test your proposed fix when I get back home; it looks reasonable.

@staviq
Contributor

staviq commented Sep 2, 2023

I found something else: disabling logs via --log-disable also disables TEE prints, which in the case of main would disable all explicit stderr prints too.

Hmmm.

I need an opinion on this:

Disabling logs via the compile-time define flag converts all LOG_TEE calls back into ordinary fprintf(stderr, ...), basically "restoring" the old behaviour of main.
Should runtime --log-disable work the same way?

Currently, --log-disable completely disables all log prints, both direct LOG prints and LOG_TEE prints, so only non-log stdout prints remain. This makes for much, much cleaner output, but I'm not sure if that's how others imagined it.
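
For reference, a rough sketch of the two disable paths being compared; the define and helper names are assumptions for illustration, not quotes from common/log.h:

```cpp
#include <cstdio>

#if defined(LOG_DISABLE_LOGS)
    // Compile-time disable: LOG vanishes entirely and LOG_TEE degrades to a
    // plain stderr print, i.e. main behaves as it did before logging existed.
    #define LOG(...)
    #define LOG_TEE(...) std::fprintf(stderr, __VA_ARGS__)
#else
    // Normal build: LOG writes to the logfile, LOG_TEE writes to both the
    // logfile and stderr (log_write_* are placeholders for the real handlers).
    // The runtime --log-disable flag currently silences both of these.
    #define LOG(...)     log_write_to_file(__VA_ARGS__)
    #define LOG_TEE(...) log_write_to_file_and_stderr(__VA_ARGS__)
#endif
```

The question above is whether the runtime flag should match the first branch (keep the plain stderr prints) instead of silencing LOG_TEE entirely.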

@KerfuffleV2
Collaborator Author

Disabling logs via the compile-time define flag converts all LOG_TEE calls back into ordinary fprintf(stderr, ...), basically "restoring" the old behaviour of main.

Should runtime --log-disable work the same way?

I'd say so.

This makes for much, much cleaner output

The funny thing is it doesn't really eliminate the main output spam, which is dumping all the tensors.


I'm absolutely in favour of thread safety if there is a use case for it now.

Ah, I couldn't say anything about that either way; it's just something I noticed since I was looking at the code. I guess the thing to consider would be: can someone accidentally use it without realizing that it's not actually thread-safe? If I were writing an example or whatever, I'd think I could use it, since it's in common/ and gets pulled in by common.h. Might be worth at least adding some comments noting that the logging feature isn't thread-safe?

@staviq
Contributor

staviq commented Sep 2, 2023

OK, this PR fixes the unnecessary empty logfiles, but it prevents the log target from being changed while logs are disabled, so my --log-test fails.

However, I don't see it being an issue in practice, since --log-disable should supersede log_enable() calls anyway.

So I'm calling it good to go

And I think I will open a separate PR for log improvements and thread safety.

@KerfuffleV2 KerfuffleV2 merged commit 3358c38 into ggml-org:master Sep 2, 2023
@KerfuffleV2
Collaborator Author

And I think I will open a separate PR for log improvements and thread safety.

I'll probably see it anyway, but feel free to @ me or request a review for those changes if you end up making a pull.

@staviq
Contributor

staviq commented Sep 4, 2023

@KerfuffleV2 @cebtenzzre
If you find a spare moment, I would welcome a sanity check on the logging thread-safety PoC, specifically LOG_TS_WRAPPER / LOG_TS_WRAPPER_2. There is some testing code in #ifdef LOG_TS_TEST:

https://github.com/staviq/llama.cpp/blob/acfa2ff1f5886a2c0b6c2c9e37de19384c0f34ba/common/log.h#L68

@KerfuffleV2
Collaborator Author

@staviq I'll check this out tomorrow, but a sanity check from me is unfortunately not worth that much. I'm not a C++ person and basically only knew enough to notice that there didn't seem to be any thread-safety handling in the logging when I looked at the code. Threading is notoriously hard to get right, and I know very little about threading in C++, so a problem would probably have to be really, really obvious for me to notice it when looking at the code.

I can also try out the changes (on Linux) and see if they work.

@staviq
Contributor

staviq commented Sep 4, 2023

No problem :)
I'll probably just do the PR as a draft, sometime tomorrow.

@cebtenzzre
Collaborator

cebtenzzre commented Sep 5, 2023

@staviq I don't think your locking mechanism is race-free, because if test_and_set returns nonzero, there is no guarantee that we make it into cv.wait before the conflicting access clears the flag, notifies, and returns. The first parameter to the condition variable exists for a reason: the use of it must be guarded by a meaningful lock in order for it to be safe. Guarding only cv.wait with a mutex does nothing.

If making std::mutex more efficient were this easy, it would already be built-in :)
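
A minimal sketch of the race-free pattern being described here (generic code, not the PoC itself): the flag is read and written only under the same mutex the condition variable waits on, and the wait uses a predicate so a notify that fires before the waiter blocks cannot be lost.

```cpp
#include <condition_variable>
#include <mutex>

static std::mutex              log_mtx;
static std::condition_variable log_cv;
static bool                    log_busy = false;   // the shared flag, guarded by log_mtx

void log_acquire() {
    std::unique_lock<std::mutex> lock(log_mtx);
    // Predicate form of wait: if the notify already happened, log_busy is
    // already false and we never block, so there is no check-then-wait window
    // in which a wakeup can slip through.
    log_cv.wait(lock, [] { return !log_busy; });
    log_busy = true;
}

void log_release() {
    {
        std::lock_guard<std::mutex> lock(log_mtx);
        log_busy = false;   // modify the flag under the same mutex the waiter uses
    }
    log_cv.notify_one();
}
```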

@KerfuffleV2
Collaborator Author

I looked at the code more in depth, but as expected I really don't have much to offer as far as evaluating whether your approach is correct or not. Pretty much the only thing I can suggest is that, since this isn't really performance-sensitive code, you could just take the brute-force approach of using a global logging mutex. That way you can just acquire the lock at the very start and really only have to worry about dropping it (not sure if that can even be done automatically with C++?) when you exit the function.

That's likely to be a lot simpler to manage than trying to be fancy. It's not pretty to just globally lock/unlock for every log line, but the typical case is just going to be a single-threaded environment with no contention, so I'd imagine it wouldn't even show up in performance metrics.
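
For what it's worth, the "dropping it automatically" part is exactly what a RAII guard such as std::lock_guard does; a minimal sketch of the brute-force version, with placeholder names rather than the actual logging code:

```cpp
#include <cstdio>
#include <mutex>

static std::mutex log_mutex;   // one global lock for the whole logging facility

void log_line(FILE * target, const char * text) {
    // lock_guard acquires the mutex here and releases it automatically on every
    // exit path, including exceptions, so nothing has to be unlocked by hand.
    std::lock_guard<std::mutex> lock(log_mutex);
    std::fprintf(target, "%s\n", text);
    std::fflush(target);
}
```

In the uncontended case the lock is a single atomic fast path on common implementations, which is dwarfed by the string formatting and file I/O on the same call.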

@staviq
Contributor

staviq commented Sep 5, 2023

@staviq I don't think your locking mechanism is race-free, because if test_and_set returns nonzero, there is no guarantee that we make it into cv.wait before the conflicting access clears the flag, notifies, and returns. The first parameter to the condition variable exists for a reason: the use of it must be guarded by a meaningful lock in order for it to be safe. Guarding only cv.wait with a mutex does nothing.

If making std::mutex more efficient were this easy, it would already be built-in :)

Oh, I see it now: specifically, the section between unique_lock and cv.wait will not be synchronised with clear and notify. I know why my test seemed to work: every subsequent call would "push" this potential block forward, and only the last call had a chance of desync, so it didn't matter that I ran 10k threads for testing.

My dilemma is that mutexes are slow, and logs already introduce a small reduction in overall performance.

It's a bit of a shame we can't go up to C++20, because it introduces wait on atomic_flag, which would make test/wait/clear properly race-free.

I think I need to revert to a normal mutex but limit its use to explicitly modifying calls (log enable/disable/set_target), wrap the log FILE* handle in an atomic, and separate log_handler into a mutex-guarded configuration handler and a read-only, mutex-free fetch for the actual FILE* target.
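
A sketch of that split, with assumed names (not the eventual PR): the configuration calls take a mutex, while the logging hot path only does an atomic load of the target.

```cpp
#include <atomic>
#include <cstdio>
#include <mutex>

static std::mutex          log_cfg_mutex;             // guards reconfiguration only
static std::atomic<FILE *> log_target_file{nullptr};  // read lock-free on the hot path

// Modifying calls (enable/disable/set_target) go through the mutex.
void log_set_target(FILE * f) {
    std::lock_guard<std::mutex> lock(log_cfg_mutex);
    log_target_file.store(f, std::memory_order_release);
}

// The hot path only needs the read-only, mutex-free fetch; a null target
// doubles as "logging disabled". Interleaving of concurrent writes to the
// same FILE* is not addressed here.
void log_write(const char * text) {
    FILE * f = log_target_file.load(std::memory_order_acquire);
    if (f) {
        std::fprintf(f, "%s\n", text);
    }
}
```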

Thank you :)

@KerfuffleV2
Collaborator Author

My dilemma is that mutexes are slow, and logs already introduce a small reduction in overall performance.

Are you sure it will really matter in the case where there's no contention? I'm not an expert on C++ mutexes, but generally speaking I'd really expect stuff like formatting the strings and writing the file to take way, way more time than acquiring a mutex lock in a single-threaded situation.

It's a bit of a shame we can't go up to C++20

Let's just switch to Rust. :)

@staviq
Contributor

staviq commented Sep 5, 2023

Are you sure it will really matter in the case where there's no contention?

I'm starting to notice I have a lot of habits and possibly unwarranted aversions toward certain patterns or solutions, from years of writing realtime/embedded code in C :)

I think I can deterministically separate the operations requiring a "global" mutex from those where std::atomic would suffice, which should achieve mostly the same performance as the atomic_flag-based solution, and it will probably end up looking cleaner too.

@KerfuffleV2
Collaborator Author

I'm starting to notice I have a lot of habits and possibly unwarranted aversions toward certain patterns or solutions

If it makes you feel any better, I'm (also?) really bad about premature optimization and getting fixated on doing stuff the nice/pretty way even when it doesn't matter. It's a lot easier to give the kind of advice in my last comment than to take it!

@KerfuffleV2 KerfuffleV2 deleted the fix-log-disabled-empty-logfile branch September 6, 2023 08:50
@staviq
Contributor

staviq commented Sep 11, 2023

I tried several different approaches to thread safety, including explicit mutex(es), and any kind of cleverness gets replaced with a mutex by the compiler anyway.

Overall, my thread-safety attempts result in a ~4% tps degradation compared to the current implementation, which is huge compared to the fractional improvements people spend a lot of time fighting for.

I'm not sure how to go about this at this point.

@cebtenzzre
Collaborator

cebtenzzre commented Sep 11, 2023

I'm not sure how to go about this at this point.

Have you tried a spinlock using atomic_flag and sched_yield? Should be virtually zero overhead when there's no contention.
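
A minimal sketch of that kind of spinlock (placeholder names, not the eventual implementation), with std::this_thread::yield standing in for sched_yield:

```cpp
#include <atomic>
#include <thread>

static std::atomic_flag log_lock = ATOMIC_FLAG_INIT;

// Uncontended case: one test_and_set, no syscall, essentially free.
// Contended case: spin, yielding the timeslice so the waiter doesn't burn a core.
void log_lock_acquire() {
    while (log_lock.test_and_set(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
}

void log_lock_release() {
    log_lock.clear(std::memory_order_release);
}
```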

@staviq
Contributor

staviq commented Sep 11, 2023

I'm not sure how to go about this at this point.

Have you tried a spinlock using atomic_flag and sched_yield?

I didn't think this would be portable, but now that you mention it, I checked, and there seems to be std::this_thread::yield in the standard C++ library.

I'll try it when I get back from work.
