Adding machine ids to telemetry #3494
src/dotnet/project.json
Outdated
"exclude": "compile"
}
},
"System.Net.NetworkInformation": "4.1.0-rc2-*"
Am I really the only guy having a problem with all these crazy telemetry-related PRs? Just like command line arguments, sending machine/environment-specific things like MAC addresses sucks really hard from a privacy perspective, even if they are hashed. If you really want to be able to correlate collected traces, why not simply generate a unique identifier when installing .NET CLI and store it somewhere in the user profile? Not perfect, but still much better than using MAC addresses for that.
@PinpointTownes You're not the only one upset about these developments. It doesn't bother me that the program exists and provides MS with basic usage information, but it does bother me when sensitive, machine/location-identifiable information is sent to MS and the program is still being operated covertly. The combination of ...
... makes the way that the program is being managed dangerous to Microsoft's objectives.
Just store and reuse a GUID? So it's specific but more anonymous.
I have to agree with the ones before me. If the only purpose really is to correlate data, storing a GUID and using that would be way more privacy-friendly than using the MAC address.
It should be OPT_IN and not OPT_OUT. See the case of the Visual C++ team: https://www.reddit.com/r/cpp/comments/4ibauu/visual_studio_adding_telemetry_function_calls_to/ People usually take telemetry as spying on them.
Why does Microsoft need to identify distinct machines? Is there any documentation or discussion we can see to find out how and why this conclusion was drawn?
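The suggestion made in the comments above (generate a random identifier once, store it in the user profile, reuse it afterwards) could be sketched roughly as follows. This is a minimal illustration and not code from the PR: the function name, the `telemetry_id` file name, and the choice of a version 4 UUID are all assumptions.

```python
import uuid
from pathlib import Path


def get_or_create_install_id(profile_dir: Path) -> str:
    """Return a random per-install identifier, creating it on first use.

    Hypothetical sketch: the file name 'telemetry_id' is an assumption,
    not something the .NET CLI is known to use.
    """
    id_file = profile_dir / "telemetry_id"
    if id_file.exists():
        stored = id_file.read_text(encoding="utf-8").strip()
        try:
            # Reuse the stored value only if it parses as a UUID.
            return str(uuid.UUID(stored))
        except ValueError:
            pass  # Corrupt or empty file: fall through and regenerate.
    # uuid4() is random, so the value carries no hardware information.
    new_id = str(uuid.uuid4())
    profile_dir.mkdir(parents=True, exist_ok=True)
    id_file.write_text(new_id, encoding="utf-8")
    return new_id
```

Deleting the file resets the identifier, which is the main privacy difference from a value derived from hardware.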
@Yantrio I'd guess because an IP address isn't a very good measure, as you will get large activation clumping behind NATs/gateways/proxies. Much like you use a cookie in a browser rather than the IP address to derive any sensible per-user usage stats for a website.
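To make the NAT point concrete, here is a small illustrative sketch with made-up events (the addresses come from a documentation range and the client IDs are hypothetical). Counting distinct IP addresses collapses everyone behind the same gateway into one, while counting a per-client identifier does not.

```python
# Made-up sample: four clients, but only two public IP addresses,
# because several clients share one NAT gateway.
events = [
    {"ip": "203.0.113.7", "client_id": "a"},
    {"ip": "203.0.113.7", "client_id": "b"},
    {"ip": "203.0.113.7", "client_id": "c"},
    {"ip": "203.0.113.9", "client_id": "d"},
    {"ip": "203.0.113.9", "client_id": "d"},  # same client, second event
]


def count_distinct(rows, key):
    """Count how many different values of `key` appear in `rows`."""
    return len({row[key] for row in rows})


by_ip = count_distinct(events, "ip")         # undercounts shared gateways
by_id = count_distinct(events, "client_id")  # one per client
```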
My question isn't so much why it's required to use MAC addresses vs. IP; it's more why at all. Is it important to know what a specific user is doing?
@Yantrio, as already said in the other discussions linked below, "We're only interested in aggregate data that we can use to identify trends". Aggregated data can be quite useful for many different reasons. For example, it can help engineers prioritize features that will make the product better. Also, there are several other discussions and documentation on this and other topics mentioned.
That doesn't explain why a random identifier wouldn't work in this case.
You have GitHub, Uservoice and a bunch of other channels to hear what your community needs or wants 😄
That doesn't capture what people use most, only what a louder subset talks about most :)
@PinpointTownes, GitHub and other community/user feedback sources are indeed leveraged quite a lot already, though they typically convey a different set of information. It is more informative and accurate to have a complete picture from multiple sources instead of just one. Having just one source can sometimes paint an inaccurate and biased picture.
As proof that community feedback already influences decisions: you mentioned "command line arguments" previously. They were ultimately dropped from that earlier pull request, and the decision to drop them was largely driven by the community response.
A random identifier would work for the core usages, but not for all of them. There are other scenarios where it doesn't work, and that is a contributing factor in why it's done this specific way. As one example, there is interest in VS Code/CLI correlation, as the two are popular to use together. So if we want to correlate against that already collected data, that would be one reason to collect a comparable value.
Thanks everyone for the feedback. I'm closing this PR now. I'll tell you why. A little bit of context first. We added telemetry to the .NET Core Tools in RC2. I appreciated the feedback that folks made on that, which helped us to create a better product. #1 focus on the telemetry front is sharing usage data with you. We will not even consider making any more changes until that job is done. That's a promise. Now, to this change. You can see that the change came in 20 days ago. This was during the height of shipping .NET Core 1.0. As a result, the PM team was super focussed on shipping 1.0 and didn't talk to the folks working on this PR. This is a poor excuse. I know you expect more from us. With 1.0 out the door, we're going to take a closer look at our telemetry plan, both in terms of the actual telemetry and community engagement. We need telemetry for this product, however, "community engagement by PR" is not working. We need to adopt a different model for engagement. This is your .NET, and that needs to be more obvious from our engagement. I ask for your continued engagement on this topic and to extend your patience a little bit longer. Thanks everyone.
The telemetry currently collected cannot accurately differentiate users or machines.
Adding a hashed machine ID to the collected telemetry to improve the ability to differentiate distinct machines (users).
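The description does not show how the hashed machine ID is derived; the discussion indicates it is based on the MAC address. As a rough sketch only, assuming SHA-256 over the six raw MAC bytes (the hash function, the input encoding, and the function name are assumptions, not the PR's implementation):

```python
import hashlib
import uuid
from typing import Optional


def hashed_machine_id(mac: Optional[int] = None) -> str:
    """Return a hex SHA-256 digest of a 48-bit MAC address.

    Hypothetical sketch. If `mac` is omitted, uuid.getnode() supplies the
    hardware address, or a random 48-bit number if none can be found.
    """
    if mac is None:
        mac = uuid.getnode()
    mac_bytes = mac.to_bytes(6, "big")  # 48 bits = 6 bytes
    return hashlib.sha256(mac_bytes).hexdigest()


# The same address always produces the same digest, so the value survives
# reinstalls and is comparable across tools that hash the same input.
example = hashed_machine_id(0x001122334455)
```

That stability is what allows the cross-tool correlation mentioned in the thread, and it is also why commenters prefer a stored random GUID, which the user can reset.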