No primary server available #3634
Comments
Nothing leaps out at the moment. Are you sure none of the mongodb servers are crashing? Also, can you maintain a steady connection using the shell? |
Running the command |
Are you connecting using the same connection string, using the DNS? Also looks like your storage flat-lined after the issue, can you double check and see if you've run out of hard drive space on one of your machines? |
I wasn't using the same connection string. Do you think using the private EC2 IP addresses would resolve this? Not sure what's causing the storage to max out like that, but even after booting new instances the issue with no primary servers still occurs with plenty of space available. |
The EC2 IP addresses may help, depending on how your replica set is configured. Can you show me the output of |
This is the rs.status() while the connections are on the rise.
|
Nothing out of the ordinary in the replica set. Do you have any other relevant code samples, for instance, do you have any code that's reacting to mongoose connection events? Another potential issue worth considering, are you using an up-to-date new relic agent? I'd try running without new relic and see if this still happens, new relic monkey-patches the mongodb driver so that can sometimes lead to unexpected behavior. |
We've been outputting the mongoose connection events:
This is what some of the logs look like
I was at the MongoDB Days event this week, where I was able to schedule some time and show this issue to one of the senior engineers at MongoDB, and they were not sure what the issue was. They did suggest adding the replica set name and max pool size to the connection string, which has not resolved this issue, unfortunately. We also tried disabling keep-alive, and setting it to a smaller value on the instances, but that did not seem to resolve it either. We're using |
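The event-logging code referred to in this comment did not survive in the thread. As a rough illustration only (not the poster's actual code), connection events could be logged and counted along the lines of the sketch below. The helper name `logConnectionEvents` and the log format are assumptions made for this example; the event names are the ones mongoose connections emit.

```javascript
'use strict';

// Connection events emitted by a mongoose connection.
const CONNECTION_EVENTS = [
  'connecting', 'connected', 'open', 'disconnecting',
  'disconnected', 'close', 'reconnected', 'error'
];

// Hypothetical helper: attaches one listener per event and keeps a running
// count, so a burst of "reconnected" events is easy to spot in the logs.
// `connection` can be any EventEmitter, e.g. mongoose.connection.
function logConnectionEvents(connection, log) {
  const counts = {};
  CONNECTION_EVENTS.forEach(function (name) {
    counts[name] = 0;
    connection.on(name, function (err) {
      counts[name] += 1;
      // Only the "error" event carries an Error object worth printing.
      const detail = name === 'error' && err ? ': ' + err.message : '';
      log(new Date().toISOString() + ' mongoose ' + name +
          ' #' + counts[name] + detail);
    });
  });
  return counts;
}

// Usage with mongoose (assumed to be installed in the application):
//   const mongoose = require('mongoose');
//   logConnectionEvents(mongoose.connection, console.log);
```

Comparing the running "reconnected" count against the server-side connection count may help show whether the two climb together, as described later in the thread.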
Hmm yeah it does look like mongoose is reconnecting for some reason. Can you show me the output of |
$ npm list | grep "mongoose"
├─┬ [email protected]
$ npm list | grep "mongo"
├─┬ [email protected]
│ ├─┬ [email protected]
│ │ ├─┬ [email protected]
├─┬ [email protected]
├─┬ [email protected]
│ ├─┬ [email protected]
│ │ ├── [email protected] |
What are you using |
Currently not using We do have |
Not that I know of. I'm just trying to see if there are other connections to mongodb that might be contributing to this issue. I've done a little googling - are you using the same DNS names for your connection string as the ones that appear in |
This error will occur when using the same DNS in the connection string as the "syncingTo" attribute in The only thing I haven't tried yet is just setting |
I'd also try running with |
We're running into the same issue. We have a site that is experiencing a sustained load of about 100 RPM with peaks in the 500-700+ RPM range. We seem to see this throughout, even during relatively quiet periods. Environment: Connection String: NPM:
Connection.js
// Mongoose import
var mongoose = require('mongoose');
var options = {
  server: {
    socketOptions: {
      keepAlive: 1,
      poolSize: 10,
      connectTimeoutMS: 30000,
      socketTimeoutMS: 30000
    }
  },
  replset: {
    socketOptions: {
      keepAlive: 1,
      poolSize: 10,
      connectTimeoutMS: 30000,
      socketTimeoutMS: 30000
    }
  }
};
mongoose.connect((process.env.MONGOLAB_URI || "mongodb://localhost/test"), options, function(error) {
  if (error) {
    console.log(error);
  }
});
module.exports = {
  mongoose: mongoose
};
Logging: Message: no primary server available |
That stack trace really only tells me that 1) you're using new relic (which is very questionable, since new relic does a lot of monkey-patching of the mongodb driver), and 2) the mongodb driver thinks that there is no primary available, but I'm not sure why. Try enabling the mongodb driver's debug mode by adding loggerLevel: 'debug' to the replset options:
var options = {
  server: {
    socketOptions: {
      keepAlive: 1,
      poolSize: 10,
      connectTimeoutMS: 30000,
      socketTimeoutMS: 30000
    }
  },
  replset: {
    loggerLevel: 'debug',
    socketOptions: {
      keepAlive: 1,
      poolSize: 10,
      connectTimeoutMS: 30000,
      socketTimeoutMS: 30000
    }
  }
};
This will log a lot of driver debug data to stdout and help us figure out what's going wrong. Can you capture this data around when this "No primary server found" error occurs? |
Thanks @vkarpov15 , We have added that and will report back as soon as we have another one triggered. Cheers, |
I don't think newrelic is the problem here. We tried running without it and this issue persists. Will collect some log data from |
Thanks, let me know if you manage to catch more details on the error. |
Another data point: Mongoose triggers the "reconnected" event over and over as the connection count increases. The "no primary server available" errors usually trigger after the connection count has already begun to climb. |
We have experienced this issue as well. We have a Node app hosted on Heroku with MongoLab. |
Bump - we're seeing this on
|
@sansmischevia are you using mongolab + heroku as well? |
^ We're experiencing this problem in a large production deployment on AWS EC2 with self-hosted mongodb servers via Cloud Manager. |
Hello, We would also like to chime in.
It is often reproducible during an operation that seeds the database, involving a lot of queries. Our application seems to be unaffected after this occurs. No errors in mongo log and our three node replica set is healthy during this time. We will try |
@vkarpov15 we're on mongolab replsets + ec2 directly |
I am experiencing this issue on mongolab as well. |
@christkv I've been waiting until this happens again to send you some logs in that other ticket. Our cluster has actually been stable for the last few weeks and we have not seen this error. |
@ChrisZieba funny how that always seems to happen lol 👍 I'll leave the ticket open in jira for now and see what we can figure out. |
@christkv Hi Christian, I'm just curious if you have any pointers on workarounds in the case of lower traffic. I was thinking of just reducing the pool size as well as increasing the timeouts. |
If it helps anyone else, I removed the socket timeout, increased keepAlive to 200, and reduced the poolSize to 3. I seem to have a lot fewer disconnects/reconnects, however it does still occasionally happen. |
If it helps anyone, we removed almost all mongoose settings, including socketTimeout and connectionTimeout and keepAlive and connections started to be stable. Our poolSize is 200. mongoose v4.4.2 |
Do you have a huge number of slow operations? If you don't, I don't think you will notice any difference between a pool of 20 sockets vs 500. |
Sorry... it's 200. Fixed the comment. And yeah, you're right. We don't sense much difference, but we'd rather have the pool size larger than smaller. The real problem is when connections keep opening and are never closed. This used to happen until we removed all mongoose timeout and keepAlive settings. I wonder why these are handled by mongoose/mongo-driver instead of letting the OS do it? |
Well, 2.1.7 and higher has a redesigned pool that avoids this. If you set socketTimeout to 0 you delegate it to the OS, but that might mean as much as 10 minutes of hanging connections. |
Ok. interesting. So now that I removed the keepAlive and socketTimeout settings what are the default settings? |
It depends; I'm not sure whether mongoose sets any specific settings as defaults. If you use the MongoClient.connect method in the driver, it's 30 seconds for both connect and socket timeouts. |
We do use |
Well with 500 connections you need at least 500 ops inside the socketTimeout period to keep the pool open, otherwise it will close down and force a reconnect. This changes in 2.1.7 however as the pool is a growing/shrinking model. |
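To put rough numbers on that rule of thumb, here is a small back-of-the-envelope sketch. It assumes the pre-2.1.7 pool behavior described in the comment above and the simplification that operations are spread evenly, one per socket; the function name `minOpsPerSecond` is made up for this example.

```javascript
'use strict';

// Hypothetical helper: the sustained operation rate needed so that every
// socket in the pool sees at least one operation per socketTimeout window.
// Assumes operations are spread evenly across sockets (a simplification).
function minOpsPerSecond(poolSize, socketTimeoutMS) {
  if (socketTimeoutMS <= 0) {
    return 0; // a socketTimeout of 0 leaves idle handling to the OS
  }
  return poolSize / (socketTimeoutMS / 1000);
}

// 500 sockets with a 30 second timeout: 500 / 30, roughly 16.7 ops/second.
console.log(minOpsPerSecond(500, 30000).toFixed(1));
// 10 sockets with a 30 second timeout: 10 / 30, roughly 0.3 ops/second.
console.log(minOpsPerSecond(10, 30000).toFixed(1));
```

Under these assumptions, a pool of 500 with the 30 second timeout needs about 1000 operations per minute to stay fully open, which is more than the roughly 100 RPM sustained load reported earlier in the thread.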
I am having the same issue with mongodb 3.2.6 and mongoose 4.3.4. Any help on this? |
@15astro try to remove the settings of |
@refaelos Ok, will try that. I tried with keepAlive=6000 but that didn't help. Just wanted to know how removing |
Yeah we tried it with different values and only when we completely removed these settings things started to work well. |
@refaelos: I had no luck removing these settings. Is there anything else I am missing? |
@15astro no man. Sorry. This is what our settings look like today:
|
In my case it was related to lack of IP to name binding in /etc/hosts. If you have set up replica set with names instead of IPs and you have something like this in /etc/hosts of MongoDB nodes:
Then you also need to put it in /etc/hosts of all your app servers. I thought that node-mongo connects according to whatever I put in the URI, but it's not the case. It seems that node-mongo connects by IP or name from Mongo URI, then gets hostnames of other replica members from the first MongoDB node that responded to request. It gets for example Hope that helps. |
@adriank yes, that's correct: it bases its connections on the ones it gets back from the replicaset config. The reason is that this is the canonical source of truth about a replicaset. This is also why all addresses in the replicaset configuration must be resolvable by the driver for the driver to fail over correctly and for it to be able to detect servers being added to and removed from the set. Previous drivers did not implement the SDAM spec and were more lax. This, however, would cause problems in production environments. |
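Since the driver connects using the member names from the replica set configuration rather than the names in the URI, one quick sanity check is to compare the two lists. The sketch below is illustrative only: the helper names `seedHosts` and `membersMissingFromSeedList` are made up for this example, the host names are placeholders, and `members` is assumed to have the shape of the `members` array in `rs.status()` output (objects with a `name` field such as "host:27017").

```javascript
'use strict';

// Hypothetical helper: extract the "host:port" seed list from a
// mongodb:// connection string.
function seedHosts(uri) {
  const withoutScheme = uri.replace(/^mongodb:\/\//, '');
  // Drop the database name and options (everything from the first "/" or "?").
  const authority = withoutScheme.split(/[/?]/)[0];
  // Drop "user:password@" credentials if present.
  const hostList = authority.slice(authority.lastIndexOf('@') + 1);
  return hostList.split(',').map(function (host) {
    // 27017 is the default MongoDB port when none is given.
    return host.includes(':') ? host : host + ':27017';
  });
}

// Hypothetical helper: list replica set member names that do not appear in
// the connection string. These are names the app server must still be able
// to resolve (via DNS or /etc/hosts), because the driver will use them.
function membersMissingFromSeedList(uri, members) {
  const seeds = new Set(seedHosts(uri));
  return members
    .map(function (member) { return member.name; })
    .filter(function (name) { return !seeds.has(name); });
}

// Example with placeholder values: the URI uses IPs, the set uses hostnames.
console.log(membersMissingFromSeedList(
  'mongodb://10.0.0.1:27017,10.0.0.2:27017/test?replicaSet=rs0',
  [{ name: '10.0.0.1:27017' }, { name: 'mongo-2:27017' }]
));
```

A non-empty result is not an error by itself; it only flags names worth checking for resolvability from the app servers.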
@christkv However it is a nightmare for tools like our MongoSpector. Because of it we have problems with connecting securely to more than one replica from one host. DigitalOcean auto-generates names to droplets that almost nobody changes and the effect is that many clients have |
We are tracking a server ticket here https://jira.mongodb.org/browse/SERVER-1889. I would love for something like this to be possible. We should also file a ticket with DigitalOcean pointing out the mistake they are making and how it's affecting their users. |
By the way, you can remove and re-add the replicaset members, using their IPs as the new names. |
Having a similar issue: after around 12-24 hours of being connected we get the error "No primary server available". Restarting usually fixes the issue. connection: |
We added this, which reduced the operations on the primary, and we no longer get the "no primary found" error.
|
I have an issue that is rather difficult to debug, and was wondering if anyone sees anything wrong with my configuration.
Node.js version 4.2.1 and MongoDB version 3.0.7 with mongoose 4.2.8.
This seems to happen randomly and will open many connections until I finally restart the node process. The cluster is healthy at all times during this error. This error happens hundreds of times per hour. There does not seem to be any consistency as to when the error will begin. For example, it occurs when the cluster is operating normally and no changes to the primary have been made.
This is what the db stats look like. As you can see the number of connections will steadily increase. If I kill the node process and start a new one everything is fine.
Config
Connection String
Stack trace