-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Cleanup 9apr25 #4073
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: cep-15-accord
Are you sure you want to change the base?
Cleanup 9apr25 #4073
Conversation
patch by Blake Eggleston; reviewed by Benedict Elliott Smith, David Capwell for CASSANDRA-17103
Patch by Blake Eggleston; Reviewed by David Capwell & Benedict Elliott Smith for CASSANDRA-18004
Patch by Blake Eggleston; Reviewed by David Capwell and Benedict Elliott Smith for Cassandra-18192
…n merging to mainline patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-18309
…ce RandomSource patch by David Capwell; reviewed by Blake Eggleston for CASSANDRA-18213
patch by Jacek Lewandowski; reviewed by David Capwell and Caleb Rackliffe for CASSANDRA-18302
…cep-15-accord on cep-21-tcm
…ss when TransactionStatement is prepared patch by David Capwell; reviewed by Ariel Weisberg, Caleb Rackliffe for CASSANDRA-18337
…ID_RSP but got ACCORD_SIMPLE_RSP patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-18375
…ition with PreApply where reads and writes are interleaved, causing one of the coordinators to see the writes from the other patch by David Capwell; reviewed by Ariel Weisberg for CASSANDRA-18422
…o a Stage and run directly in the messageing handler patch by David Capwell; reviewed by Ariel Weisberg, Benedict Elliott Smith for CASSANDRA-18364
…rt tests to add custom logic patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-18485
in a durable log before processing by CommandStores patch by Aleksey Yeschenko; reviewed by David Capwell for CASSANDRA-18344
…tion history patch by Benedict; reviewed by Blake Eggleston for CASSANDRA-18523
- removing unnecessary calls to ServerTestUtils.daemonInitialization() in a handful of tests - minor cleanup in Verb and BTreeSet
patch by David Capwell; reviewed by Ariel Weisberg for CASSANDRA-18519
patch by Aleksey Yeschenko; reviewed by Benedic Elliott Smith for CASSANDRA-18561
patch by Aleksey Yeschenko; reviewed by Blake Eggleston for CASSANDRA-18563
Patch by Blake Eggleston and Benedict Elliott Smith; Reviewed by David Capwell for CASSANDRA-17101 CEP-15: Accord TCM integration Patch by Blake Eggleston; Reviewed by David Capwell for CASSANDRA-18444
…ctions that are known to be applied across the cluster) patch by Benedict Elliott Smith; reviewed by Ariel Weisberg, Aleksey Yeschenko, and David Capwell for CASSANDRA-18883 Co-authored-by: Benedict Elliott Smith <benedict@apache.org> Co-authored-by: Ariel Weisberg <aweisberg@apple.com> Co-authored-by: Aleksey Yeschenko <aleksey@apache.org> Co-authored-by: David Capwell <dcapwell@gmail.com>
…ing right away (apache#3575) patch by David Capwell; reviewed by Blake Eggleston for CASSANDRA-18764
apache/cassandra-accord#56 Patch by Ariel Weisberg; Reviewed by David Capwell for CASSANDRA-18779
…ure not to loose the partial deps (apache#3590) patch by David Capwell; reviewed by Aleksey Yeschenko for CASSANDRA-18783
Accord compaction purgers see random slices of Accord state during compaction (based on randomly selected compaction inputs). For at least the `durability` column in the `commands` table the tombstone being created when truncating was deleting the latest value since we can get enough information to truncate without actuall yhaving the latest `durability` value. To fix we can wait to emit a tombstone until we are erasing the entire command row when truncating or truncating with outcome and meanwhile we can drop the extra columns that are no longer needed instead of using a tombstone. We don't need to emit cell tombstones we can drop them from the purger when processing each row. patch by Ariel Weisberg; reviewed by David Capwell for CASSANDRA-18795
…s nullable but C* serializer doesn't expect null
…s in TxnWrite, as they can simply be pulled from PartialTxn when needed in Write#apply() - Avoid serializing full TxnData instances to Accord state tables patch by Caleb Rackliffe; reviewed by David Capwell, Benedict Elliot Smith, and Ariel Weisberg for CASSANDRA-18355
patch by Aleksey Yeschenko; reviewed by Ariel Weisberg for CASSANDRA-18573
…, SeedDefiner, RunStartDefiner, and Config
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20316
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20358
Improve: - Introduce pre/accept fast execution flags - Introduce searchable Deps serialization - Flatten AccordRoutingKey(s) into single type, using sentinel bits - Introduce new fast byte-comparable serialization methods for Token and TableId to support above Fix: - Fix journal re-serialization logic - Enable RandomPartitioner for Accord by supporting fixed-width serialization for RouteIndex patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20349
- Only use persisted RedundantBefore for compaction - RouteIndex should index only touches, not Route - Flush RangesForEpoch updates to journal immediately, so we do not rely on the command we are processing succeeding - DurableBefore updates must wait for the epochs to be known locally - Shard.mustWitnessEpoch to support guaranteeing to witness relevant non-topology schema changes - We must propagate RedundantBefore RX shard bounds along with epoch syncs - Prevent a truncated transaction FetchData infinite loop - GC_BEFORE status being overwritten by bootstrappedAt, permitting old transaction state to be resurrected - Avoid CFK.maxUniqueHlc read race on bootstrap - TopologyManager.awaitEpoch could wait for wrong epoch - Journal fsync thread could miss notifications Also improve: - CommandStores uses SearchableRangeList for finding matching stores - Refactor RedundantBefore to use a sorted array of TxnId/RedundantStatus pairs (to better fix GC_BEFORE issue) - Accord debug keyspace operates on keyspace/table, and sorts correctly by token patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20361
- Bad ArrayBuffers recycling logic - RX must ensure dependencies TRANSITIVE_VISIBLE - Permit constructing "antiRange" that spans multiple prefixes - Not computing range CommandSummary IsDep correctly - Truncated commands that aren't shard durable could not repopulate CFK on replay, permitting recovery of another command to make an incorrect decision - NPE on async persist of RX (i.e. supplying no callback) - NPE in Builder.shouldCleanup when durability is null patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20370
…ding topology and other durability requirements. Also improve: - Introduce DurabilityService - Retire SyncPoint, replace Barrier with Write and RX - MessageType -> enum, restore GetMaxConflict - Standardise backoff logic with WaitStrategy - improve TimeoutStrategy/RetryStrategy specification strings - Forbid KX, remove directKeyDeps - Introduce UniqueTimeService, permitting hlc reservations for sync points avoid delay when min TxnId is sufficiently in the past - Remove ListStore custom purge logic Also fix: - RejectBefore should reject on both epoch and hlc - Do not record sync success for removed nodes - Support GlobalDurability detecting no command store to run on - Incorrect ballot constructor - Serializing 15-bit ballot flags incorrectly - TopologyManager.hasEpoch deadlock - Computing withOpenEpochs incorrectly, sometimes stopping one epoch short - PartitionKey serializer should not depend on schema information that can be erased patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20395
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20393.
patch by Benedict Elliott Smith; reviewed by David Capwell for CASSANDRA-20420
…a NPE patch by David Capwell; reviewed by Benedict Elliott Smith for CASSANDRA-20417
Patch by Alex Petrov; reviewed by Benedict Elliott Smith CASSANDRA-20347.
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20424.
… required topology changes patch by David Capwell; reviewed by Benedict Elliott Smith for CASSANDRA-20426
- Decouple command serialization from TableMetadata version; introduce ColumnMetadata ids; gracefully handle missing TableId - DataInputPlus.readLeastSignificantBytes must truncate high bits - Fix RandomPartitioner accord serialization - Fast path stable commits must not override recovery propose/commit decisions regarding visibility of a transaction - RejectBefore must mergeMax, not merge, to ensure we maintain epoch and hlc increasing independently - Bad commitInvalidate decision - consistent filtering for touches and stillTouches - ensure TRUNCATE_BEFORE implies SHARD_APPLIED - TopologyManager.unsyncedOnly off-by-one error - DurabilityQueue should not retry SyncPointErased - handle rare case of no deps but none needed - not updating CFK synchronously on recovery, which can lead to erroneous recovery decisions for other transactions - Don't return partial read response when one commandStore rejects the commit - Filter touches/stillTouches consistently - WaitingState computeLowEpoch must use hasTouched to handle historic key with no route Improve: - Use format parameters to defer building Invariants.requireArgument string - streamline RedundantStatus/RedundantBefore
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20114
…e changes without impacting C* messaging patch by David Capwell; reviewed by Aleksey Yeschenko, Alex Petrov for CASSANDRA-20403
- Accord Journal purging was disabled - remove unique_id from schema keyspace - avoid String.format in Compactor hot path - avoid string concatenation on hot path; improve segment compactor partition build efficiency - Partial compaction should update records in place to ensure truncation of discontiguous compactions do not lead to an incorrect field version being used - StoreParticipants.touches behaviour for RX was erroneously modified; should touch all non-redundant ranges including those no longer owned - SetShardDurable should correctly set DurableBefore Majority/Universal based on the Durability parameter - fix erroneous prunedBefore invariant - Journal compaction should not rewrite fields shadowed by a newer record - Don't save updates to ERASED commands - Simplify CommandChange.getFlags - fix handling of Durability for Invalidated - Don't use ApplyAt for GC_BEFORE with partial input, as might be a saveStatus >= ApplyAtKnown but with executeAt < ApplyAtKnown patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20441
* Fix short accord simulation test (seed 0x6bea128ae851724b), ConcurrentModificationException * Increase wait time during closing to avoid Unterminated threads * Increase timeouts, improve test stability * More descriptive output from CQL test * Shorten max CMS delay * Improve future handling in config service Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20440
…YNC mode patch by Benedict; reviewed by David Capwell for CASSANDRA-20521
patch by Benedict; reviewed by David Capwell for CASSANDRA-20529
…p-15-accord patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-20535
@@ -49,6 +51,9 @@ public static void handle(Throwable t) | |||
{ | |||
try | |||
{ | |||
if (t instanceof RequestTimeoutException || t instanceof CancellationException) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe we talked about this before, this is to avoid logging CancellationException
right? is RequestTimeoutException
getting logged or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, reported twice I think, once by the Future
we use to report the error, and once by the AccordAgent
.
ACCORD_GET_DURABLE_BEFORE_RSP (154, P2, readTimeout, IMMEDIATE, () -> accordEmbedded(GetDurableBeforeSerializers.reply), AccordService::responseHandlerOrNoop ), | ||
ACCORD_GET_DURABLE_BEFORE_REQ (155, P2, readTimeout, IMMEDIATE, () -> accordEmbedded(GetDurableBeforeSerializers.request), AccordService::requestHandlerOrNoop, ACCORD_GET_DURABLE_BEFORE_RSP ), | ||
ACCORD_GET_DURABLE_BEFORE_RSP (154, P2, readTimeout, MISC, () -> accordEmbedded(GetDurableBeforeSerializers.reply), AccordService::responseHandlerOrNoop ), | ||
ACCORD_GET_DURABLE_BEFORE_REQ (155, P2, readTimeout, MISC, () -> accordEmbedded(GetDurableBeforeSerializers.request), AccordService::requestHandlerOrNoop, ACCORD_GET_DURABLE_BEFORE_RSP ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While investigating I saw a lot of dropped messages I couldn't explain, so I looked for possible handlers that should not be executed on the IMMEDIATE stage. This one is probably fine to be honest as it's non-blocking, but it isn't sent to the any CommandStore
for processing so the processing would be done on the netty event loop
4b79471
to
2e1712d
Compare
2e1712d
to
9f5f692
Compare
Thanks for sending a pull request! Here are some tips if you're new here:
Commit messages should follow the following format:
The Cassandra Jira