redis

Commit Graph

Author	SHA1	Message	Date
antirez	0f1f25784f	Cluster: better timeout and retry time for failover. When node-timeout is too small, in the order of a few milliseconds, there is no way the voting process can terminate during that time, so we set a lower limit for the failover timeout of two seconds. The retry time is set to two times the failover timeout time, so it is at least 4 seconds.	2014-03-10 09:57:52 +01:00
Matt Stancliff	f0782a6e86	Fix key extraction for z{union,inter}store The previous implementation wasn't taking into account the storage key in position 1 being a requirement (it was only counting the source keys in positions 3 to N). Fixes antirez/redis#1581	2014-03-07 16:33:20 -05:00
antirez	6984692060	Cluster: fix conditional generating TRYAGAIN error.	2014-03-07 16:18:00 +01:00
antirez	36676c2318	Redis Cluster: support for multi-key operations.	2014-03-07 13:19:09 +01:00
Salvatore Sanfilippo	bbf39b7a3a	Merge pull request #1576 from Hailei/fix-lruidletime-comment Fix REDIS_LRU_CLOCK_MAX's value	2014-03-06 18:14:36 +01:00
antirez	b74c899da3	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2014-03-06 18:06:30 +01:00
Matt Stancliff	e8bae92e54	Reset op_sec_last_sample_ops when reset requested This value needs to be set to zero (in addition to stat_numcommands) or else people may see a negative operations per second count after they run CONFIG RESETSTAT. Fixes antirez/redis#1577	2014-03-06 18:00:08 +01:00
Matt Stancliff	385c25f70f	Remove redundant IP length definition REDIS_CLUSTER_IPLEN had the same value as REDIS_IP_STR_LEN. They were both #define'd to the same INET6_ADDRSTRLEN.	2014-03-06 17:55:43 +01:00
Matt Stancliff	d2040ab9b1	Remove some redundant code Function nodeIp2String in cluster.c is exactly anetPeerToString with a pre-extracted fd.	2014-03-06 17:55:39 +01:00
Matt Stancliff	59cf0b1902	Fix return value check for anetTcpAccept anetTcpAccept returns ANET_ERR, not AE_ERR. This isn't a physical error since both ANET_ERR and AE_ERR are -1, but better to be consistent.	2014-03-06 17:55:31 +01:00
Salvatore Sanfilippo	54e99fb226	Merge pull request #1578 from badboy/patch-5 Small typo fixed	2014-03-06 17:40:04 +01:00
antirez	9b401819c0	Cast saveparams[].seconds to long for %ld format specifier.	2014-03-05 11:26:18 +01:00
Jan-Erik Rediger	5f5118bdad	Small typo fixed	2014-03-05 00:41:02 +01:00
Matt Stancliff	e5b1e7be64	Bind source address for cluster communication The first address specified as a bind parameter (server.bindaddr[0]) gets used as the source IP for cluster communication. If no bind address is specified by the user, the behavior is unchanged. This patch allows multiple Redis Cluster instances to communicate when running on the same interface of the same host.	2014-03-04 17:36:45 -05:00
antirez	47750998a6	Sentinel: more aggressive failover start desynchronization. Sentinel needs to avoid split brain conditions due to multiple sentinels trying to get voted at the exact same time. So far some desynchronization was provided by fluctuating server.hz, that is the frequency of the timer function call. However the desynchronization provided in this way was not enough when using many Sentinel instances, especially when a large quorum value is used in order to force a greater degree of agreement (more than N/2+1). It was verified that it was likely to trigger a split brain condition, forcing the system to try again after a timeout. Usually the system will succeed after a few retries, but this is not optimal. This commit desynchronizes instances in a more effective way to make it likely that the first attempt will be successful.	2014-03-04 17:09:36 +01:00
antirez	08da025f56	CONFIG REWRITE should be logged at WARNING level.	2014-03-04 16:39:47 +01:00
zhanghailei	138695d990	refer to updateLRUClock's comment REDIS_LRU_CLOCK_MAX is 22 bits,but #define REDIS_LRU_CLOCK_MAX ((1<<21)-1) only 21 bits	2014-03-04 12:20:31 +08:00
zhanghailei	c0f8665414	FIXED a typo more thank should be more than	2014-03-04 11:21:34 +08:00
zhanghailei	4b9ac6edd0	According to context,the size should be 16 rather than 64	2014-03-04 11:21:34 +08:00
antirez	c5edd91716	Cluster: invalidate current transaction on redirections.	2014-03-03 17:11:51 +01:00
antirez	e41a3edfab	Merge branch 'cli_improved_bigkeys' of git://github.com/michael-grunder/redis into unstable	2014-03-03 11:20:54 +01:00
antirez	12a88d575d	Document why we update peak memory in INFO.	2014-03-03 11:19:54 +01:00
antirez	0c1bb1313c	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2014-03-03 11:17:37 +01:00
antirez	8dea2029a4	Fix configEpoch assignment when a cluster slot gets "closed". This is still code to rework in order to use agreement to obtain a new configEpoch when a slot is migrated, however this commit handles the special case that happens when the nodes are just started and everybody has a configEpoch of 0. In this special condition having the maximum configEpoch is not enough, as the special epoch 0 is not unique (all the others are). This does not fix the intrinsic race condition of a failover happening while we are resharding, that will be addressed later.	2014-03-03 11:12:11 +01:00
Matt Stancliff	f1c9a203b2	Force INFO used_memory_peak to match peak memory used_memory_peak only updates in serverCron every server.hz, but Redis can use more memory and a user can request memory INFO before used_memory_peak gets updated in the next cron run. This patch updates used_memory_peak to the current memory usage if the current memory usage is higher than the recorded used_memory_peak value. (And it only calls zmalloc_used_memory() once instead of twice as it was doing before.)	2014-02-28 17:47:41 -05:00
antirez	a89c8bb87c	Sentinel test: Makefile target added.	2014-02-28 16:00:00 +01:00
michael-grunder	806788d009	Improved bigkeys with progress, pipelining and summary This commit reworks the redis-cli --bigkeys command to provide more information about our progress as well as output summary information when we're done. - We now show an approximate percentage completion as we go - Hiredis pipelining is used for TYPE and SIZE retrieval - A summary of keyspace distribution and overall breakout at the end	2014-02-27 12:01:57 -08:00
antirez	76a6e82d89	warnigns -> warnings in redisBitpos().	2014-02-27 13:17:23 +01:00
antirez	0e31eaa27f	More consistent BITPOS behavior with bit=0 and ranges. With the new behavior it is possible to specify just the start in the range (the end will be assumed to be the last byte), or it is possible to specify both start and end. This is useful to change the behavior of the command when looking for zeros inside a string. 1) If the user specifies both start and end, and no 0 is found inside the range, the command returns -1. 2) If instead no range is specified, or just the start is given, even if in the actual string no 0 bit is found, the command returns the first bit on the right after the end of the string. So for example if the string stored at key foo is "\xff\xff": BITPOS foo (returns 16) BITPOS foo 0 -1 (returns -1) BITPOS foo 0 (returns 16) The idea is that when no end is given the user is just looking for the first bit that is zero and can be set to 1 with SETBIT, as it is "available". Instead when a specific range is given, we just look for a zero within the boundaries of the range.	2014-02-27 12:53:03 +01:00
antirez	38c620b3b5	Initial implementation of BITPOS. It appears to work but more stress testing, and both unit tests and fuzzy testing, is needed in order to ensure the implementation is sane.	2014-02-27 12:44:27 +01:00
antirez	addd4de9c1	Merge branch 'unstable' of github.com:/antirez/redis into unstable	2014-02-27 10:14:03 +01:00
antirez	746ce35f5f	Fix misaligned word access in redisPopcount().	2014-02-27 09:46:20 +01:00
Matt Stancliff	d769cad4bf	Fix IP representation in clusterMsgDataGossip	2014-02-25 16:02:28 -05:00
antirez	55e36e1132	Merge branch 'bigkeys_scan' of git://github.com/michael-grunder/redis into unstable	2014-02-25 14:59:57 +01:00
michael-grunder	013a4ce242	Update --bigkeys to use SCAN This commit changes the findBigKeys() function in redis-cli.c to use the new SCAN command for iterating the keyspace, rather than RANDOMKEY. Because we can know when we're done using SCAN, it will exit after exhausting the keyspace.	2014-02-25 05:41:30 -08:00
antirez	a2c76ffb1c	redis-cli: also remove useless uint8_t.	2014-02-25 13:47:37 +01:00
antirez	ba993cc685	redis-cli: don't use uint64_t where actually not needed. The computation is just something to take the CPU busy, no need to use a specific type. Since stdint.h was not included this prevented compilation on certain systems.	2014-02-25 13:44:31 +01:00
antirez	5580350a7b	redis-cli: check argument existence for --pattern.	2014-02-25 12:38:29 +01:00
antirez	c1d67ea9b4	redis-cli: --intrinsic-latency run mode added.	2014-02-25 12:37:52 +01:00
antirez	dcac007b81	redis-cli: added comments to split program in parts.	2014-02-25 12:24:45 +01:00
antirez	b15411df98	Sentinel: log quorum with +monitor event.	2014-02-24 17:10:20 +01:00
antirez	6b373edb77	Sentinel: generate +monitor events at startup.	2014-02-24 16:33:55 +01:00
antirez	3b7a757468	Sentinel: log +monitor and +set events. Now that we have a runtime configuration system, it is very important to be able to log how the Sentinel configuration changes over time because of API calls.	2014-02-24 16:33:43 +01:00
antirez	25cebf7285	Sentinel: added missing exit(1) after checking for config file.	2014-02-24 16:22:52 +01:00
Salvatore Sanfilippo	e163332858	Merge pull request #1545 from mattsta/fix-redis-cli-sync Deny SYNC and PSYNC in redis-cli	2014-02-23 17:47:28 +01:00
antirez	b1c1386374	Sentinel: IDONTKNOW error removed. This error was conceived for the older version of Sentinel that worked via master redirection and that was not able to get configuration updates from other Sentinels via the Pub/Sub channel of masters or slaves. This reply does not make sense today, every Sentinel should reply with the best information it has currently. The error will make even less sense in the future since the plan is to allow Sentinels to update the configuration of other Sentinels via gossip with a direct chat without the prerequisite that they have at least a monitored instance in common.	2014-02-22 17:34:46 +01:00
Matt Stancliff	2c273e3591	Add cluster or sentinel to proc title If you launch redis with `redis-server --sentinel` then in a ps, your output only says "redis-server IP:Port" — this patch changes the proc title to include [sentinel] or [cluster] depending on the current server mode: e.g. "redis-server IP:Port [sentinel]" "redis-server IP:Port [cluster]"	2014-02-20 23:58:54 -05:00
antirez	7d7b3810e7	Sentinel: report instances role switch events. This is useful mostly for debugging of issues.	2014-02-20 12:13:52 +01:00
Matt Stancliff	ce68caea37	Cluster: error out quicker if port is unusable The default cluster control port is 10,000 ports higher than the base Redis port. If Redis is started on a too-high port, Cluster can't start and everything will exit later anyway.	2014-02-19 17:30:07 -05:00
Matt Stancliff	b20ae393f1	Fix "can't bind to address" error reporting. Report the actual port used for the listening attempt instead of server.port. Originally, Redis would just listen on server.port. But, with clustering, Redis uses a Cluster Port too, so we can't say server.port is always where we are listening. If you tried to launch Redis with a too-high port number (any port where Port+10000 > 65535), Redis would refuse to start, but only print an error saying it can't connect to the Redis port. This patch fixes much confusion.	2014-02-19 17:26:33 -05:00
antirez	7cec9e48ce	Sentinel: SENTINEL_SLAVE_RECONF_RETRY_PERIOD -> RECONF_TIMEOUT Rename define to match the new meaning.	2014-02-18 10:27:38 +01:00
antirez	18b8bad53c	Sentinel: fix slave promotion timeout. If we can't reconfigure a slave in time during failover, go forward anyway, as the slave will be fixed by Sentinels in the future, once they detect it is misconfigured. Otherwise a failover in progress may never terminate if for some reason the slave is unable to sync with the master while at the same time it is not disconnected.	2014-02-18 08:50:57 +01:00
antirez	ede33fb912	Get absolute config file path before processing 'dir'. The code tried to obtain the configuration file absolute path after processing the configuration file. However if config file was a relative path and a "dir" statement was processed reading the config, the absolute path obtained was wrong. With this fix the absolute path is obtained before processing the configuration while the server is still in the original directory where it was executed.	2014-02-17 16:44:53 +01:00
antirez	e1b77b61f3	Sentinel: better specify startup errors due to config file. Now it logs the file name if it is not accessible. Also there is a different error for the missing config file case, and for the non writable file case.	2014-02-17 16:44:49 +01:00
antirez	51bd9da1fd	Update cached time in rdbLoad() callback. server.unixtime and server.mstime are cached less precise timestamps that we use every time we don't need an accurate time representation and a syscall would be too slow for the number of calls we require. Such an example is the initialization and update process of the last interaction time with the client, that is used for timeouts. However rdbLoad() can take some time to load the DB, but at the same time it did not update the time during DB loading. This resulted in the bug described in issue #1535, where in the replication process the slave loads the DB, creates the redisClient representation of its master, but the timestamp is so old that the master, under certain conditions, is sensed as already "timed out". Thanks to @yoav-steinberg and Redis Labs Inc for the bug report and analysis.	2014-02-13 15:13:26 +01:00
antirez	7e8abcf693	Log when CONFIG REWRITE goes bad.	2014-02-13 14:32:44 +01:00
antirez	21e6b0fbe9	Fix script cache bug in the scripting engine. This commit fixes a serious Lua scripting replication issue, described by Github issue #1549. The root cause of the problem is that scripts were put inside the script cache, assuming that slaves and AOF already contained it, even if the scripts sometimes produced no changes in the data set, and were not actually propagated to AOF/slaves. Example: eval "if tonumber(KEYS[1]) > 0 then redis.call('incr', 'x') end" 1 0 Then: evalsha <sha1 step 1 script> 1 0 At this step sha1 of the script is added to the replication script cache (the script is marked as known to the slaves) and EVALSHA command is transformed to EVAL. However it is not dirty (there are no changes to the db), so it is not propagated to the slaves. Then the script is called again: evalsha <sha1 step 1 script> 1 1 At this step master checks that the script already exists in the replication script cache and doesn't transform it to EVAL command. It is dirty and propagated to the slaves, but they fail to evaluate the script as they don't have it in the script cache. The fix is trivial and just uses the new API to force the propagation of the executed command regardless of the dirty state of the data set. Thank you to @minus-infinity on Github for finding the issue, understanding the root cause, and fixing it.	2014-02-13 12:10:43 +01:00
antirez	fc08c8599f	AOF write error: retry with a frequency of 1 hz.	2014-02-12 16:27:59 +01:00
antirez	fe8352540f	AOF: don't abort on write errors unless fsync is 'always'. A system similar to the RDB write error handling is used, in which when we can't write to the AOF file, writes are no longer accepted until we are able to write again. For fsync == always we still abort on errors since there is currently no easy way to avoid replying with success to the user otherwise, and this would violate the contract with the user of only acknowledging data already secured on disk.	2014-02-12 16:11:36 +01:00
antirez	db6d628c3e	Cluster: clusterDelNode(): remove node from master's slaves.	2014-02-11 10:34:25 +01:00
antirez	5e0e03be41	Cluster: UPDATE messages are the norm and verbose. Logging them at WARNING level was of little utility and certainly a disturbance.	2014-02-11 10:18:24 +01:00
antirez	8251d2d150	Cluster: redis-trib fix: handling of another trivial case.	2014-02-11 10:13:18 +01:00
antirez	4a64286c36	Cluster: configEpoch assignment in SETNODE improved. Avoid to trash a configEpoch for every slot migrated if this node has already the max configEpoch across the cluster. Still work to do in this area but this avoids both ending with a very high configEpoch without any reason and to flood the system with fsyncs.	2014-02-11 10:09:17 +01:00
antirez	72f7abf6a2	Cluster: clusterSetStartupEpoch() made more generally useful. The actual goal of the function was to get the max configEpoch found in the cluster, so make it general by removing the assignment of the max epoch to currentEpoch that is useful only at startup.	2014-02-11 10:00:14 +01:00
antirez	44f7afe28a	Cluster: always increment the configEpoch in SETNODE after import. Removed a stale conditional preventing the configEpoch from incrementing after the import in certain conditions. Since the master got a new slot it should always claim a new configuration.	2014-02-11 09:50:37 +01:00
antirez	a1349728ea	Cluster: on resharding upgrade version of receiving node. The node receiving the hash slot needs to have a version that wins over the other versions in order to force the ownership of the slot. However the current code is far from perfect since a failover can happen during the manual resharding. The fix is a work in progress but the bottom line is that the new version must either be voted as usually, set by redis-trib manually after it makes sure can't be used by other nodes, or reserved configEpochs could be used for manual operations (for example odd versions could be never used by slaves and are always used by CLUSTER SETSLOT NODE).	2014-02-11 00:36:05 +01:00
antirez	6dc26795aa	Cluster: fsync at every SETSLOT command puts too much pressure on disks. During slots migration redis-trib can send a number of SETSLOT commands. Fsyncing every time is a bit too much in production as verified empirically. To make sure configs are fsynced on all nodes after a resharding redis-trib may send something like CLUSTER CONFSYNC. In this case fsyncs were not providing too much value since anyway processes can crash in the middle of the resharding of a hash slot, and redis-trib should be able to recover from this condition anyway.	2014-02-10 23:54:08 +01:00
antirez	218358bbbd	Cluster: conditions to clear "migrating" on slot for SETSLOT ... NODE changed. If the slot is manually assigned to another node, clear the migrating status regardless of the fact it was previously assigned to us or not, as long as we no longer have keys for this slot. This avoids a race during slots migration that may leave the slot in migrating status in the source node, since it received an update message from the destination node that is already claiming the slot. This way we are sure that redis-trib at the end of the slot migration is always able to close the slot correctly.	2014-02-10 23:51:47 +01:00
antirez	3107e7ca60	Cluster: remove debugging xputs from redis-trib.	2014-02-10 19:14:05 +01:00
antirez	1ae50a9b1d	Cluster: redis-trib fix: cover new case of open slot. The case is the trivial one a single node claiming the slot as migrating, without nodes claiming it as importing.	2014-02-10 19:10:23 +01:00
antirez	59e03a8f35	redis-trib: log event after we have reference to 'master'.	2014-02-10 18:48:40 +01:00
antirez	bf670e0745	Cluster: don't update slave's master if we don't know it. There is no way we can update the slave's node->slaveof pointer if we don't know the master (no node with such an ID in our tables).	2014-02-10 18:33:34 +01:00
antirez	a3755ae9ee	Cluster: ignore slot config changes if we are importing it.	2014-02-10 18:04:43 +01:00
antirez	6fc53e16ad	Cluster: update configEpoch after manually messing with slots.	2014-02-10 18:01:58 +01:00
antirez	be0bb19fd3	Cluster: redis-trib, more info about open slots error.	2014-02-10 17:44:16 +01:00
antirez	1a73c992a3	Cluster: fixed inverted arguments in logging function call.	2014-02-10 17:21:10 +01:00
antirez	32563b4a5f	Cluster: clear the FAIL status for masters without slots. Masters without slots don't participate in the cluster but just do redirections, so there is no need to keep them in FAIL state once they are reachable again.	2014-02-10 17:18:27 +01:00
Matt Stancliff	21648473aa	Auto-enter slaveMode when SYNC from redis-cli If someone asks for SYNC or PSYNC from redis-cli, automatically enter slaveMode (as if they ran redis-cli --slave) and continue printing the replication stream until either they Ctrl-C or the master gets disconnected.	2014-02-10 11:10:31 -05:00
antirez	5b2082ead3	Cluster: replica migration should only work for masters serving slots.	2014-02-10 17:08:37 +01:00
antirez	f106a79309	Cluster: redis-trib del-node variable typo fixed.	2014-02-10 16:59:09 +01:00
antirez	f885fa8bac	Cluster: clusterReadHandler() fixed to work with new message header.	2014-02-10 16:27:37 +01:00
antirez	344a065d51	Cluster: don't propagate PUBLISH two times. PUBLISH both published messages via Cluster bus and replication when cluster was enabled, resulting in duplicated message in the slave.	2014-02-10 16:00:27 +01:00
antirez	7bf7b7350c	Cluster: signature changed to "RCmb" (Redis Cluster message bus). Sounds better after all.	2014-02-10 15:55:21 +01:00
antirez	dced9c0619	Cluster: discard bus messages with version != 0.	2014-02-10 15:54:22 +01:00
antirez	007e1c7cb2	Cluster: added signature + version in bus packets.	2014-02-10 15:53:09 +01:00
antirez	dca95f241c	Cluster: redis-trib: options table entry for add-node fixed.	2014-02-10 12:34:21 +01:00
antirez	6df4ffe639	Don't count time to feed MONITORs in SLOWLOG.	2014-02-07 18:29:20 +01:00
antirez	142281dc79	Cluster: keys slot computation now supports hash tags. Currently this is marginally useful, only to make sure two keys are in the same hash slot when the cluster is stable (no rehashing in progress). In the future it is possible that support will be added to run multi-key operations with keys in the same hash slot.	2014-02-07 17:39:01 +01:00
antirez	2d6eb68993	Sentinel: allow SHUTDOWN command in Sentinel mode.	2014-02-07 11:22:24 +01:00
antirez	970de3e9c0	Check for EAGAIN in sendBulkToSlave(). Sometime an osx master with a Linux server over a slow link caused a strange error where osx called the writable function for the socket but actually apparently there was no room in the socket buffer to accept the write: write(2) call returned an EAGAIN error, that was not checked, so we considered write(2) == 0 always as a connection reset, which was unfortunate since the bulk transfer has to start again. Also more errors are logged with the WARNING level in the same code path now.	2014-02-05 16:38:10 +01:00
antirez	04fe000bf8	Cluster: fixed MF condition in clusterHandleSlaveFailover(). For manual failover we need a manual failover in progress, and that mf_can_start is true (master offset received and matched).	2014-02-05 16:01:56 +01:00
antirez	c6f02fd67a	Cluster: CLUSTER FAILOVER replies with OK and logs the event.	2014-02-05 15:52:38 +01:00
antirez	c72449af30	Cluster: check that a MF is in progress in manualFailoverCheckTimeout(). Otherwise it is always detected as a manual failover timed out.	2014-02-05 15:45:24 +01:00
antirez	b7402bcad5	Cluster: force AUTH ACK on manual failover. When a slave requests masters vote for a manual failover, the REQUEST_AUTH message is flagged in a special way in order to force the masters to give the authorization even if the master is not marked as failing.	2014-02-05 13:10:03 +01:00
antirez	4cf0cd5719	Cluster: manual failover initial implementation.	2014-02-05 13:01:24 +01:00
antirez	4919a13f50	CLIENT PAUSE and related API implemented. The API is one of the building blocks of CLUSTER FAILOVER command that executes a manual failover in Redis Cluster. However exposed as a command that the user can call directly, it makes it much simpler to upgrade a standalone Redis instance using a slave in a safer way. The command works like this: CLIENT PAUSE <milliseconds> All the clients that are not slaves and not in MONITOR state are paused for the specified number of milliseconds. This means that slaves are normally served in the meantime. At the end of the specified amount of time all the clients are unblocked and will continue operations normally. This command has no effects on the population of the slow log, since clients are not blocked in the middle of operations but only when there is new data to process. Note that while the clients are paused, new commands are still accepted and queued in the client buffer, so clients will likely not block while writing to the server while the pause is active.	2014-02-04 16:16:09 +01:00
antirez	b089ba98cc	Scripting: expire keys in scripts only at first access. Keys expiring in the middle of the execution of Lua scripts can create inconsistencies in masters and / or AOF files. See the following example: if redis.call("exists",KEYS[1]) == 1 then redis.call("incr","mycounter") end if redis.call("exists",KEYS[1]) == 1 then return redis.call("incr","mycounter") end The script executes two times the same "if key exists then increment counter" logic. However the two executions will work differently in the master and the slaves, provided some unlucky timing happens. In the master the first time the key may still exist, while the second time the key may no longer exist. This will result in the key incremented just one time. However as a side effect the master will generate a synthetic `DEL` command in the replication channel in order to force the slaves to expire the key (given that key expiration is master-driven). When the same script will run in the slave, the key will no longer be there, so the script will not increment the key. The key idea used to implement the expire-at-first-lookup semantics was provided by Marc Gravell.	2014-02-03 16:15:53 +01:00
antirez	b770079f2c	Allow CONFIG and SHUTDOWN while in stale-slave state.	2014-02-03 15:51:03 +01:00
antirez	89884e8f6e	Scripting: use mstime() and mstime_t for lua_time_start. server.lua_time_start is expressed in milliseconds. Use mstime_t instead of long long, and populate it with mstime() instead of ustime()/1000. Functionally identical but more natural.	2014-02-03 15:45:40 +01:00
antirez	7be946fde2	Option "backlog" renamed "tcp-backlog". This is especially important since we already have a concept of backlog (the replication backlog).	2014-01-31 14:56:10 +01:00
Nenad Merdanovic	d76aa96d1a	Add support for listen(2) backlog definition In high RPS environments, the default listen backlog is not sufficient, so giving users the power to configure it is the right approach, especially since it requires only minor modifications to the code.	2014-01-31 14:52:10 +01:00
antirez	a7d30681c9	Cluster: configurable replicas migration barrier. It is possible to configure the min number of additional working slaves a master should be left with, for a slave to migrate to an orphaned master.	2014-01-31 11:26:36 +01:00
antirez	3ff1bb4b2e	Sentinel: check arity for SENTINEL MASTER command. This fixes issue #1530.	2014-01-31 10:13:38 +01:00
antirez	6c9359add1	Cluster: perform orphaned masters check before continue statements. The check was placed in a way that conflicted with the continue statements used by the node heartbeat code later that needs to skip the current node sometimes. Moved at the start of the function so that's always executed.	2014-01-30 18:23:31 +01:00
antirez	c2507b0ff6	Cluster: replica migration implementation. This feature allows slaves to migrate to orphaned masters (masters without working slaves), as long as a set of conditions are met, including the fact that the migrating slave needs to be in a master-slaves ring with at least another slave working.	2014-01-30 18:05:11 +01:00
antirez	5b4020fb42	Cluster: swap two code blocks to have a more obvious flow.	2014-01-30 16:34:23 +01:00
antirez	4beaaff8ea	Cluster: remove not needed return statement breaking failover.	2014-01-29 17:28:46 +01:00
antirez	3582054982	Cluster: broadcast pong to other slaves in the same ring. When we schedule a failover, broadcast a PONG to the slaves. The other slaves that plan to get elected will do the same too, this way it is likely that every slave will have a good picture of its own rank. Note that this is N*N messages where N is the number of slaves for the failing master, however usually even large clusters have many master nodes but a limited number of replicas per node, so this is harmless.	2014-01-29 17:19:55 +01:00
antirez	e2b59621a8	Cluster: log offset when announcing the failover election delay.	2014-01-29 17:16:10 +01:00
antirez	940531e9b7	Cluster: added progressive election delay according to slave rank. Note that when we compute the initial delay, there are probably still more up to date information to receive from slaves with new offsets, so the delay is recomputed when new data is available.	2014-01-29 16:53:45 +01:00
antirez	6f54032080	Cluster: function clusterGetSlaveRank() added. Return the number of slaves for the same master having a better replication offset of the current slave, that is, the slave "rank" used to pick a delay before the request for election.	2014-01-29 16:39:04 +01:00
antirez	40cd38f0c4	Cluster: update node replication offset from bus packets headers.	2014-01-29 16:01:00 +01:00
antirez	9d4ded7ec6	Cluster: refactoring: new macros to check node flags.	2014-01-29 12:17:16 +01:00
antirez	099bd336db	Cluster: use myself instead of server->cluster.myself.	2014-01-29 11:38:14 +01:00
antirez	e36bd8b43e	Cluster: added a global myself pointer in cluster.c. Accessing to the 'myself' node, the node representing the currently running instance, is handy without the need to type server.cluster->myself every time.	2014-01-29 11:22:22 +01:00
antirez	f1e09d8c41	Cluster: clusterBroadcastPong() improved with target selection. Now we can broadcast a pong to all the instances or just the local slaves (that is useful for replication offset propagation).	2014-01-29 11:08:52 +01:00
antirez	befcf6259e	Cluster: broadcast master/slave replication offset in bus header.	2014-01-28 16:51:50 +01:00
antirez	8b32bd483a	Cluster: limit cluster.h to 80 cols.	2014-01-28 16:34:23 +01:00
antirez	0b1b25c51c	Cluster: introduced repl_offset fields in clusterNode. The two fields are used in order to remember the latest known replication offset and the time we received it from other slave nodes. This will be used by slaves in order to start the election procedure with a delay that is proportional to the rank of the slave among the other slaves for this master, when sorted for replication offset. Usually this allows the slave with the most updated offset to win the election and replace the failing master in the cluster.	2014-01-28 16:28:07 +01:00
antirez	72f1715e45	Fixed inverted if condition in MISCONF error code path.	2014-01-28 10:11:12 +01:00
antirez	23f4e9f0d9	Don't log MONITOR clients as disconnecting slaves.	2014-01-25 11:53:53 +01:00
antirez	40377fa522	Cluster: redis-trib set-timeout implemented.	2014-01-24 15:06:01 +01:00
antirez	0f9422d575	Cluster: update slaves lists in clusterSetMaster().	2014-01-22 18:46:53 +01:00
antirez	5383ab0bc6	Cluster: CLUSTER SLAVES subcommand added.	2014-01-22 18:38:42 +01:00
antirez	603e480fd5	Cluster: clusterGenNodesDescription() refactored into two functions.	2014-01-22 18:36:12 +01:00
antirez	1cf532dc37	redis-cli --help output improved with --scan and periods.	2014-01-22 12:07:42 +01:00
antirez	994c5b26dd	redis-cli: support for --scan option.	2014-01-22 12:04:08 +01:00
antirez	172f14d48c	Use fflush() before fsync() in rio.c. Incremental flushing in rio.c is only used to avoid huge kernel buffers synched to slow disks creating big latency spikes, so this fix has no durability implications, however it is certainly more correct to make sure that the FILE buffers are flushed to the kernel before calling fsync on the file descriptor. Thanks to Li Shao Kai for reporting this issue in the Redis mailing list.	2014-01-22 09:54:55 +01:00
antirez	80e80668f4	Cluster: master nodes wait before rejoining the cluster after reboot. One of the simple heuristics used by Redis Cluster in order to avoid losing data in the typical failure modes created by the asynchronous replication with the slaves (a master is unable, when accepting a write, to immediately tell if it should be really accepted or refused because of a configuration change), is to wait some time before rejoining the cluster after being partitioned away from the majority of instances. A similar condition happens when a master is restarted. It does not know if it was already failed over, nor if all the clients already have an updated configuration about the cluster map, so it is possible that clients will try to write to stale masters that were restarted. In a similar way this commit changes masters behavior so they wait 2000 milliseconds before accepting writes after a reboot. There is nothing special about 2 seconds other than being a value supposedly a few orders of magnitude larger than the cluster bus communication latencies.	2014-01-20 11:52:52 +01:00
antirez	e6970e204f	Cluster: debug printf statements removed. These were committed by mistake after being inserted in order to fix an issue.	2014-01-20 11:19:04 +01:00
antirez	e4a5605c9a	Cluster: don't rewrite slaveof config directive in cluster mode.	2014-01-20 11:10:42 +01:00
antirez	437fc2cb56	Cluster: fix error reporting when slaveof is found in config.	2014-01-20 11:08:14 +01:00
antirez	ac3850cabd	Cluster: allow CLUSTER REPLICATE to switch master. The code was doing checks for slaves that should be done only when the instance is currently a master. Switching a slave from a master to another one should just work.	2014-01-17 18:22:35 +01:00
antirez	abd6308d27	Set server.repl_down_since to 0 when changing master. When an instance is potentially set to replicate with another master, it is conceptually disconnected forever, since we have no old copy of the dataset for this master in memory.	2014-01-17 18:20:31 +01:00
antirez	36c24bcca0	Cluster: redis-trib shows number of replicas of masters.	2014-01-17 17:56:45 +01:00
antirez	27ed9da383	Cluster: redis-trib help output format modified.	2014-01-17 12:32:49 +01:00
antirez	a68c9ba97e	Cluster: redis-trib shows what a slave replicates + fixes. Also the :replicates info field in the node object is now correctly populated. This also fixes the :replicas field computation.	2014-01-17 12:06:18 +01:00
antirez	b451176734	Cluster: redis-trib addnode is now able to add replicas.	2014-01-17 11:48:42 +01:00
antirez	30d9c1dc32	Cluster: fix redis-trib help subcommand.	2014-01-17 10:29:40 +01:00
antirez	17d0c3e85a	Cluster: redis-trib delnode implementation.	2014-01-16 18:22:03 +01:00
antirez	3d455393a6	Cluster: don't let a node forget its own master. redis-trib should make sure to reconfigure slaves of a node to remove from the cluster to replicate with other nodes before sending CLUSTER FORGET.	2014-01-16 17:49:35 +01:00
antirez	9531c84807	Cluster: redis-trib help output improved. Show options if any. Clarify that for some command any node address is ok.	2014-01-16 16:23:33 +01:00
antirez	0c373207fa	Cluster: don't forget yourself with CLUSTER FORGET.	2014-01-16 09:46:23 +01:00
antirez	3e948970fe	Cluster: use the node blacklist in CLUSTER FORGET. CLUSTER FORGET is not useful if we can't remove a node from all the nodes of our cluster because of the Gossip protocol that keeps adding a given node to nodes where we already tried to remove it. So now CLUSTER FORGET implements a nodes blacklist that is set and checked by the Gossip section processing function. This way before a node is re-added at least 60 seconds must elapse since the FORGET execution. This means that redis-trib has some time to remove a node from a whole cluster. It is possible that in the future it will be useful to raise the 60 sec figure to something bigger.	2014-01-15 16:50:45 +01:00
antirez	ccf268fa17	Cluster: fix clusterBlacklistAddNode() by setting right expire time. The hash table value should be set to now + 60 seconds otherwise it expires immediately.	2014-01-15 16:49:31 +01:00
antirez	4e1861155f	Cluster: clusterBlacklistAddNode() key lookup fixed. We can't lookup by node->name that's not an SDS string but a plain C array in the node structure.	2014-01-15 16:45:07 +01:00
antirez	b51be7b34f	Cluster: clusterBlacklistExists() requires blacklist cleanup before lookup.	2014-01-15 16:06:54 +01:00
antirez	a81340abaf	Cluster: set a minimum rejoin delay if node_timeout is too small. The rejoin delay usually is the node timeout. However if the node timeout is too small, we set it to 500 milliseconds, that is a value chosen to be greater than most setups RTT / instances latency figures so that likely communication with other nodes happen before rejoining.	2014-01-15 12:34:33 +01:00
antirez	a687cbc19c	Cluster: periodically call clusterUpdateState() when cluster is down. Usually we update the cluster state (to understand if we should accept queries or reply with an error) only when there is a change in the state of the nodes. However for the "delayed rejoin" feature to work, that is, for a master to wait some time before accepting queries again after it rejoins the majority, we need to periodically update the last time when the node was partitioned away from the majority. With this commit if the cluster is down we update the state ten times per second.	2014-01-15 12:26:12 +01:00
antirez	25ddefdea3	Cluster: range checking in getSlotOrReply() fixed. See issue #1426 on Github.	2014-01-15 11:33:46 +01:00
antirez	fb659cd334	Cluster: ignore empty lines in nodes.conf. Even without the user messing manually with the file, it is still possible to have blank lines (just a single "\n" per line) because of how the nodes.conf update/write process works.	2014-01-15 11:23:41 +01:00
antirez	6c63df3031	Cluster: atomic update of nodes.conf file. The way the file was generated was unsafe and led to nodes.conf file corruption (zero length file) on server stop/crash during the creation of the file. The previous file update method was as simple as open with O_TRUNC followed by the write call. While the write call was a single one with the full payload, ensuring no half-written files for POSIX semantics, stopping the server just after the open call resulted in a zero-length file (all the nodes information lost!).	2014-01-15 10:31:20 +01:00
antirez	28273394cb	Cluster: support to read from slave nodes. A client can enter a special cluster read-only mode using the READONLY command: if the client reads from a slave instance after this command, for slots that are actually served by the instance's master, the queries will be processed without redirection, allowing clients to read from slaves (but without any kind of read-after-write guarantee). The READWRITE command can be used in order to exit the readonly state.	2014-01-14 16:33:16 +01:00
antirez	aacbba2607	Fix typo in aofRewriteBufferAppend() comment.	2014-01-14 15:37:49 +01:00
antirez	5189485625	Set REDIS_AOF_REWRITE_MIN_SIZE to 64mb. 64mb is the default value in redis.conf. For some reason instead the hard-coded default was 1mb that is too small.	2014-01-14 11:27:28 +01:00
antirez	d5763dceaf	SENTINEL SET master quorum implemented.	2014-01-14 09:23:26 +01:00
antirez	fe86f890b0	SENTINEL SET: error on bad option name + flush config on error.	2014-01-13 11:55:57 +01:00
antirez	f822516e43	SENTINEL SET implemented. The new command allows changing master-specific configurations at runtime. All the settable parameters can be retrieved via the SENTINEL MASTER command, so there is no equivalent "GET" command.	2014-01-13 11:53:29 +01:00
antirez	3cdcaff069	Sentinel: fix wrong arity error message.	2014-01-13 11:05:13 +01:00
antirez	964f6b17e9	Sentinel: SENTINEL REMOVE command added. The command totally removes a monitored master.	2014-01-10 15:39:36 +01:00
antirez	cf2835519e	Sentinel: releaseSentinelRedisInstance() top comment fixed. The claim about unlinking the instance from the connected hash tables was the opposite of the reality. Also the current actual behavior is safer in most cases, so it is better to manually unlink when needed.	2014-01-10 15:33:42 +01:00
antirez	9d0f46c6f5	Sentinel: flush config on disk when new master is added.	2014-01-10 15:22:06 +01:00
antirez	d4f296bc1d	anetResolveIP() prototype added to anet.h.	2014-01-10 15:18:41 +01:00
antirez	39f9f449b0	Sentinel: SENTINEL MONITOR command implemented. It allows to add new masters to monitor at runtime.	2014-01-10 15:18:24 +01:00
antirez	774f0bd45e	anetResolveIP() added to anet.c. The new function is used when we want to normalize an IP address without performing a DNS lookup if the string to resolve is not a valid IP. This is useful every time only IPs are valid inputs or when we want to skip DNS resolution that is slow during runtime operations if we are required to block.	2014-01-10 15:02:39 +01:00
antirez	c42e4bd0b6	Sentinel: added SENTINEL MASTER <name> command. With SENTINEL MASTERS it was already possible to list all the configured masters, but not a specific one.	2014-01-10 14:41:52 +01:00
antirez	2bb9cd464e	Add all the configurable fields to addReplySentinelRedisInstance(). Note: the auth password with the master is voluntarily not exposed.	2014-01-10 14:31:41 +01:00
antirez	5a7d04ee7b	Trim comment to 80 cols in SentinelCommand().	2014-01-10 14:13:04 +01:00
antirez	58c8a071a5	Fix RESTORE ttl handling in 32 bit archs. long was used instead of long long in order to handle a 64 bit resolution millisecond timestamp. This fixes issue #1483.	2014-01-09 11:09:23 +01:00
antirez	e1ab2991c3	Fix keyspace events flags-to-string conversion. Fixes issue #1491 on Github.	2014-01-08 17:18:34 +01:00
antirez	90a81b4ebb	Don't send REPLCONF ACK to old masters. Masters not understanding REPLCONF ACK will reply with errors to our requests causing a number of possible issues. This commit detects a global replication offset set to -1 at the end of the replication, and marks the client representing the master with the REDIS_PRE_PSYNC flag. Note that this flag was called REDIS_PRE_PSYNC_SLAVE but now it is just REDIS_PRE_PSYNC as it is used for both slaves and masters starting with this commit. This commit fixes issue #1488.	2014-01-08 14:28:16 +01:00
antirez	3f92e05637	Clarify a comment in slaveTryPartialResynchronization().	2014-01-08 14:28:13 +01:00
antirez	fdf50e1e3d	Log disconnection with slave only when ip:port is available.	2013-12-25 18:41:53 +01:00
antirez	2041882286	anetPeerToString / SockName: port can be NULL on errors too.	2013-12-25 18:41:49 +01:00
antirez	a2a900356e	anetTcpGenericConnect() bug introduced in `9d19977` fixed. During a refactoring I misspelled _port for port. This is one of the reasons I never used _varname myself.	2013-12-25 18:41:45 +01:00
antirez	cb23d510f4	Remove useless goto from anetTcpGenericConnect().	2013-12-25 18:41:41 +01:00
antirez	491f681088	anetTcpGenericConnect() code improved + 1 bug fix. Now the socket is closed if anetNonBlock() fails, and in general the code structure makes it harder to introduce this kind of bugs in the future. Reference: pull request #1059.	2013-12-25 18:15:28 +01:00
antirez	f510549044	Cluster: clusterProcessPacket() was not 80 cols friendly. The function actually needs to be split into sub-functions at some point in the future.	2013-12-25 17:57:36 +01:00
antirez	e789384255	Fix CONFIG REWRITE handling of unknown options. There were two problems with the implementation. 1) "save" was not correctly processed when no save point was configured, as reported in issue #1416. 2) The way the code checked if an option existed in the "processed" dictionary was wrong, as we add the element as a key associated with a NULL value, so dictFetchValue() can't be used to check for existence, but dictFind() must be used, that returns NULL only if the entry does not exist at all.	2013-12-23 12:50:27 +01:00
antirez	7e9433cee1	Configuring port to 0 disables IP socket as specified. This was no longer the case with 2.8 because of a bug introduced with the IPv6 support. Now it is fixed. This fixes issue #1287 and #1477.	2013-12-23 11:31:35 +01:00
antirez	94e8c9e77e	Make new masters inherit replication offsets. Currently replication offsets could be used in a limited way in order to understand, out of a set of slaves, what is the one with the most updated data. For example this comparison is possible if N slaves were all replicating with the same master. However the replication offset was not transferred from master to slaves (that are later promoted as masters) in any way, so for instance if there were three instances A, B, C, with A master and B and C replicating from A, the following could happen: C disconnects from A. B is turned into master. A is switched to replicate from B. B receives some write. In this context there was no way to compare the offset of A and C, because B would use its own local master replication offset as replication offset to initialize the replication with A. With this commit what happens is that when B is turned into master it inherits the replication offset from A, making A and C comparable. In the above case assuming no inconsistencies are created during the disconnection and failover process, A will show a replication offset greater than C. Note that this does not mean offsets are always comparable to understand what is, in a set of instances, the most updated one, since in more complex examples the replica with the higher replication offset could be partitioned away when picking the instance to elect as new master. However this in general improves the ability of a system to try to pick a good replica to promote to master.	2013-12-22 11:43:25 +01:00
antirez	ba5eb44d14	Slave disconnection is an event worth logging.	2013-12-22 10:15:35 +01:00
antirez	66ec1412fe	Redis Cluster: add repl_ping_slave_period to slave data validity time. When the configured node timeout is very small, the data validity time (maximum data age for a slave to try a failover) is too little (ten times the configured node timeout) when the replication link with the master is mostly idle. In this case we'll receive some data from the master only every server.repl_ping_slave_period to refresh the last interaction with the master. This commit adds to the max data validity time the slave ping period to avoid this problem of slaves sensing too old data without a good reason. However this max data validity time is likely a setting that should be configurable by the Redis Cluster user in a way completely independent from the node timeout.	2013-12-22 10:05:16 +01:00
antirez	b2dedd9da8	Log when a slave loses the connection with its master.	2013-12-21 00:23:37 +01:00
antirez	658aff9d29	Redis Cluster: move node failure reports logging from VERBOSE to NOTICE level.	2013-12-21 00:04:53 +01:00
antirez	5a404c87c1	Redis Cluster: remove no longer relevant comment.	2013-12-20 14:40:11 +01:00
antirez	fda4cba912	Redis Cluster: reconfigure replication when master changes address.	2013-12-20 12:47:22 +01:00
antirez	d7374032c0	Redis Cluster: handshake code refactoring + Gossip IP switch detection. This commit makes it simple to start a handshake with a specific node address, and uses this in order to detect a node IP change and start a new handshake in order to fix the IP if possible.	2013-12-20 12:38:03 +01:00
antirez	a2c938c834	Redis Cluster: delay state change when in the majority again. As specified in the Redis Cluster specification, when a node can reach the majority again after a period in which it was partitioned away with the minority of masters, wait some time before accepting queries, to provide a reasonable amount of time for other nodes to upgrade its configuration. This lowers the probabilities of both a client and a master with not updated configuration to rejoin the cluster at the same time, with a stale master accepting writes.	2013-12-20 09:56:18 +01:00
antirez	b3632319a4	CONFIG REWRITE: no special handling of include and rename-command. CONFIG REWRITE is now wiser and does not touch what it does not understand inside redis.conf.	2013-12-19 15:57:11 +01:00
Yubao Liu	7da423f79f	CONFIG REWRITE: don't throw some options on config rewrite Those options will be thrown without this patch: include, rename-command, min-slaves-to-write, min-slaves-max-lag, appendfilename.	2013-12-19 15:56:48 +01:00
antirez	3b9cf3ed3a	CONFIG REWRITE: old development comments removed.	2013-12-19 15:30:06 +01:00
antirez	b221e13dac	CONFIG REWRITE: don't wipe unknown options. With this commit options not explicitly rewritten by CONFIG REWRITE are not touched at all. These include new options that may not have support for REWRITE, and other special cases like rename-command and include.	2013-12-19 15:25:45 +01:00
antirez	7a666ac419	Cluster: set n->slaves to NULL in clusterNodeResetSlaves(). The value was otherwise undefined, so next time the node was promoted again from slave to master, adding a slave to the list of slaves would likely crash the server or result in undefined behavior.	2013-12-17 14:50:24 +01:00
antirez	fda91dbde3	Cluster: check link is valid before sending UPDATE.	2013-12-17 12:28:37 +01:00
antirez	f57bb36ce7	Cluster: initialize todo_before_sleep flags to 0.	2013-12-17 12:22:02 +01:00
antirez	c70c0c6db7	Cluster: use proper type mstime_t for ping delay var.	2013-12-17 10:27:36 +01:00
antirez	7c1cbdceb2	Cluster: use a hardcoded 60 sec timeout in redis-trib connections. Later this should be configurable from the command line but at least now we use something more appropriate for our use case compared to the redis-rb default timeout.	2013-12-17 10:00:33 +01:00
antirez	47815d38e0	Fixed clearNodeFailureIfNeeded() time type to mstime_t. This prevented 32bit cluster instances from clearing the FAIL flag when needed.	2013-12-17 09:45:52 +01:00
antirez	e88e6a6334	Cluster: use long long for timestamps in clusterGenNodesDescription(). Ping sent and pong received fields need to be cast to long long to be printed correctly on 32 bit systems.	2013-12-17 09:38:11 +01:00


2494 Commits