Skip to content

Cleanup 9apr25 #4073

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 231 commits into
base: cep-15-accord
Choose a base branch
from

Conversation

belliottsmith
Copy link
Contributor

Thanks for sending a pull request! Here are some tips if you're new here:

  • Ensure you have added or run the appropriate tests for your PR.
  • Be sure to keep the PR description updated to reflect all changes.
  • Write your PR title to summarize what this PR proposes.
  • If possible, provide a concise example to reproduce the issue for a faster review.
  • Read our contributor guidelines
  • If you're making a documentation change, see our guide to documentation contribution

Commit messages should follow the following format:

<One sentence description, usually Jira title or CHANGES.txt summary>

<Optional lengthier description (context on patch)>

patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####

Co-authored-by: Name1 <email1>
Co-authored-by: Name2 <email2>

The Cassandra Jira

bdeggleston and others added 30 commits March 31, 2025 17:09
patch by Blake Eggleston; reviewed by Benedict Elliott Smith, David Capwell for CASSANDRA-17103
Patch by Blake Eggleston; Reviewed by David Capwell & Benedict Elliott Smith for CASSANDRA-18004
Patch by Blake Eggleston; Reviewed by David Capwell and Benedict Elliott Smith for Cassandra-18192
…n merging to mainline

patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-18309
…ce RandomSource

patch by David Capwell; reviewed by Blake Eggleston for CASSANDRA-18213
patch by Jacek Lewandowski; reviewed by David Capwell and Caleb Rackliffe for CASSANDRA-18302
…ss when TransactionStatement is prepared

patch by David Capwell; reviewed by Ariel Weisberg, Caleb Rackliffe for CASSANDRA-18337
…ID_RSP but got ACCORD_SIMPLE_RSP

patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-18375
…ition with PreApply where reads and writes are interleaved, causing one of the coordinators to see the writes from the other

patch by David Capwell; reviewed by Ariel Weisberg for CASSANDRA-18422
…o a Stage and run directly in the messageing handler

patch by David Capwell; reviewed by Ariel Weisberg, Benedict Elliott Smith for CASSANDRA-18364
…rt tests to add custom logic

patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-18485
in a durable log before processing by CommandStores

patch by Aleksey Yeschenko; reviewed by David Capwell for
CASSANDRA-18344
…tion history

patch by Benedict; reviewed by Blake Eggleston for CASSANDRA-18523
- removing unnecessary calls to ServerTestUtils.daemonInitialization() in a handful of tests
- minor cleanup in Verb and BTreeSet
patch by David Capwell; reviewed by Ariel Weisberg for CASSANDRA-18519
patch by Aleksey Yeschenko; reviewed by Benedic Elliott Smith for
CASSANDRA-18561
patch by Aleksey Yeschenko; reviewed by Blake Eggleston for
CASSANDRA-18563
Patch by Blake Eggleston and Benedict Elliott Smith; Reviewed by David
Capwell for CASSANDRA-17101

CEP-15: Accord TCM integration

Patch by Blake Eggleston; Reviewed by David Capwell for CASSANDRA-18444
…ctions that are known to be applied across the cluster)

patch by Benedict Elliott Smith; reviewed by Ariel Weisberg, Aleksey Yeschenko, and David Capwell for CASSANDRA-18883

Co-authored-by: Benedict Elliott Smith <benedict@apache.org>
Co-authored-by: Ariel Weisberg <aweisberg@apple.com>
Co-authored-by: Aleksey Yeschenko <aleksey@apache.org>
Co-authored-by: David Capwell <dcapwell@gmail.com>
…ing right away (apache#3575)

patch by David Capwell; reviewed by Blake Eggleston for CASSANDRA-18764
apache/cassandra-accord#56

Patch by Ariel Weisberg; Reviewed by David Capwell for CASSANDRA-18779
…ure not to loose the partial deps (apache#3590)

patch by David Capwell; reviewed by Aleksey Yeschenko for CASSANDRA-18783
Accord compaction purgers see random slices of Accord state during compaction (based on randomly selected compaction inputs).

For at least the `durability` column in the `commands` table the tombstone being created when truncating was deleting the latest value since we can get enough information to truncate without actuall yhaving the latest `durability` value.

To fix we can wait to emit a tombstone until we are erasing the entire command row when truncating or truncating with outcome and meanwhile we can drop the extra columns that are no longer needed instead of using a tombstone. We don't need to emit cell tombstones we can drop them from the purger when processing each row.

patch by Ariel Weisberg; reviewed by David Capwell for CASSANDRA-18795
…s nullable but C* serializer doesn't expect null
…s in TxnWrite, as they can simply be pulled from PartialTxn when needed in Write#apply()

- Avoid serializing full TxnData instances to Accord state tables

patch by Caleb Rackliffe; reviewed by David Capwell, Benedict Elliot Smith, and Ariel Weisberg for CASSANDRA-18355
patch by Aleksey Yeschenko; reviewed by Ariel Weisberg for CASSANDRA-18573
ifesdjeen and others added 21 commits April 2, 2025 14:37
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20316
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20358
Improve:
 - Introduce pre/accept fast execution flags
 - Introduce searchable Deps serialization
 - Flatten AccordRoutingKey(s) into single type, using sentinel bits
 - Introduce new fast byte-comparable serialization methods for Token and TableId to support above
Fix:
 - Fix journal re-serialization logic
 - Enable RandomPartitioner for Accord by supporting fixed-width serialization for RouteIndex

patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20349
 - Only use persisted RedundantBefore for compaction
 - RouteIndex should index only touches, not Route
 - Flush RangesForEpoch updates to journal immediately, so we do not rely on the command we are processing succeeding
 - DurableBefore updates must wait for the epochs to be known locally
 - Shard.mustWitnessEpoch to support guaranteeing to witness relevant non-topology schema changes
 - We must propagate RedundantBefore RX shard bounds along with epoch syncs
 - Prevent a truncated transaction FetchData infinite loop
 - GC_BEFORE status being overwritten by bootstrappedAt, permitting old transaction state to be resurrected
 - Avoid CFK.maxUniqueHlc read race on bootstrap
 - TopologyManager.awaitEpoch could wait for wrong epoch
 - Journal fsync thread could miss notifications
Also improve:
 - CommandStores uses SearchableRangeList for finding matching stores
 - Refactor RedundantBefore to use a sorted array of TxnId/RedundantStatus pairs (to better fix GC_BEFORE issue)
 - Accord debug keyspace operates on keyspace/table, and sorts correctly by token

patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20361
 - Bad ArrayBuffers recycling logic
 - RX must ensure dependencies TRANSITIVE_VISIBLE
 - Permit constructing "antiRange" that spans multiple prefixes
 - Not computing range CommandSummary IsDep correctly
 - Truncated commands that aren't shard durable could not repopulate CFK on replay, permitting recovery of another command to make an incorrect decision
 - NPE on async persist of RX (i.e. supplying no callback)
 - NPE in Builder.shouldCleanup when durability is null

patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20370
…ding topology and other durability requirements.

Also improve:
 - Introduce DurabilityService
 - Retire SyncPoint, replace Barrier with Write and RX
 - MessageType -> enum, restore GetMaxConflict
 - Standardise backoff logic with WaitStrategy
 - improve TimeoutStrategy/RetryStrategy specification strings
 - Forbid KX, remove directKeyDeps
 - Introduce UniqueTimeService, permitting hlc reservations for sync points avoid delay when min TxnId is sufficiently in the past
 - Remove ListStore custom purge logic
Also fix:
 - RejectBefore should reject on both epoch and hlc
 - Do not record sync success for removed nodes
 - Support GlobalDurability detecting no command store to run on
 - Incorrect ballot constructor
 - Serializing 15-bit ballot flags incorrectly
 - TopologyManager.hasEpoch deadlock
 - Computing withOpenEpochs incorrectly, sometimes stopping one epoch short
 - PartitionKey serializer should not depend on schema information that can be erased

patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20395
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20393.
patch by Benedict Elliott Smith; reviewed by David Capwell for CASSANDRA-20420
…a NPE

patch by David Capwell; reviewed by Benedict Elliott Smith for CASSANDRA-20417
Patch by Alex Petrov; reviewed by Benedict Elliott Smith CASSANDRA-20347.
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20424.
… required topology changes

patch by David Capwell; reviewed by Benedict Elliott Smith for CASSANDRA-20426
 - Decouple command serialization from TableMetadata version; introduce ColumnMetadata ids; gracefully handle missing TableId
 - DataInputPlus.readLeastSignificantBytes must truncate high bits
 - Fix RandomPartitioner accord serialization
 - Fast path stable commits must not override recovery propose/commit decisions regarding visibility of a transaction
 - RejectBefore must mergeMax, not merge, to ensure we maintain epoch and hlc increasing independently
 - Bad commitInvalidate decision
 - consistent filtering for touches and stillTouches
 - ensure TRUNCATE_BEFORE implies SHARD_APPLIED
 - TopologyManager.unsyncedOnly off-by-one error
 - DurabilityQueue should not retry SyncPointErased
 - handle rare case of no deps but none needed
 - not updating CFK synchronously on recovery, which can lead to erroneous recovery decisions for other transactions
 - Don't return partial read response when one commandStore rejects the commit
 - Filter touches/stillTouches consistently
 - WaitingState computeLowEpoch must use hasTouched to handle historic key with no route
Improve:
 - Use format parameters to defer building Invariants.requireArgument string
 - streamline RedundantStatus/RedundantBefore
Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20114
…e changes without impacting C* messaging

patch by David Capwell; reviewed by Aleksey Yeschenko, Alex Petrov for CASSANDRA-20403
 - Accord Journal purging was disabled
 - remove unique_id from schema keyspace
 - avoid String.format in Compactor hot path
 - avoid string concatenation on hot path; improve segment compactor partition build efficiency
 - Partial compaction should update records in place to ensure truncation of discontiguous compactions do not lead to an incorrect field version being used
 - StoreParticipants.touches behaviour for RX was erroneously modified; should touch all non-redundant ranges including those no longer owned
 - SetShardDurable should correctly set DurableBefore Majority/Universal based on the Durability parameter
 - fix erroneous prunedBefore invariant
 - Journal compaction should not rewrite fields shadowed by a newer record
 - Don't save updates to ERASED commands
 - Simplify CommandChange.getFlags
 - fix handling of Durability for Invalidated
 - Don't use ApplyAt for GC_BEFORE with partial input, as might be a saveStatus >= ApplyAtKnown but with executeAt < ApplyAtKnown

patch by Benedict; reviewed by Alex Petrov for CASSANDRA-20441
  * Fix short accord simulation test (seed 0x6bea128ae851724b), ConcurrentModificationException
  * Increase wait time during closing to avoid Unterminated threads
  * Increase timeouts, improve test stability
  * More descriptive output from CQL test
  * Shorten max CMS delay
  * Improve future handling in config service

Patch by Alex Petrov; reviewed by Benedict Elliott Smith for CASSANDRA-20440
…YNC mode

patch by Benedict; reviewed by David Capwell for CASSANDRA-20521
patch by Benedict; reviewed by David Capwell for CASSANDRA-20529
…p-15-accord

patch by David Capwell; reviewed by Caleb Rackliffe for CASSANDRA-20535
@@ -49,6 +51,9 @@ public static void handle(Throwable t)
{
try
{
if (t instanceof RequestTimeoutException || t instanceof CancellationException)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe we talked about this before, this is to avoid logging CancellationException right? is RequestTimeoutException getting logged or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, reported twice I think, once by the Future we use to report the error, and once by the AccordAgent.

ACCORD_GET_DURABLE_BEFORE_RSP (154, P2, readTimeout, IMMEDIATE, () -> accordEmbedded(GetDurableBeforeSerializers.reply), AccordService::responseHandlerOrNoop ),
ACCORD_GET_DURABLE_BEFORE_REQ (155, P2, readTimeout, IMMEDIATE, () -> accordEmbedded(GetDurableBeforeSerializers.request), AccordService::requestHandlerOrNoop, ACCORD_GET_DURABLE_BEFORE_RSP ),
ACCORD_GET_DURABLE_BEFORE_RSP (154, P2, readTimeout, MISC, () -> accordEmbedded(GetDurableBeforeSerializers.reply), AccordService::responseHandlerOrNoop ),
ACCORD_GET_DURABLE_BEFORE_REQ (155, P2, readTimeout, MISC, () -> accordEmbedded(GetDurableBeforeSerializers.request), AccordService::requestHandlerOrNoop, ACCORD_GET_DURABLE_BEFORE_RSP ),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While investigating I saw a lot of dropped messages I couldn't explain, so I looked for possible handlers that should not be executed on the IMMEDIATE stage. This one is probably fine to be honest as it's non-blocking, but it isn't sent to the any CommandStore for processing so the processing would be done on the netty event loop

@belliottsmith belliottsmith force-pushed the cleanup-9apr25 branch 3 times, most recently from 4b79471 to 2e1712d Compare April 10, 2025 09:29
- Don't report request timeouts or cancellations to jvm stability inspector
- Don't report `simple` Keyspace FastPathStrategy

patch by Benedict; reviewed by David Capwell for CASSANDRA-20545
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants