ROWVERSION/TIMESTAMP skipping rows problem
Do you use ROWVERSION/TIMESTAMP to pull changed data?
If you do, you might experience a problem – not all changed rows are pulled, some are skipped. How can that be, if we pulled entire range of rowverion changes with no gaps?
Let me illustrate it in this video with demo inside:
Here is a demo script, and a way how to properly pull data without skipping rows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 |
--------------------------------- -- DEMO: Trap in pulling changes with ROWVERSION - and how to do it safely! --------------------------------- USE tempdb; IF OBJECT_ID('dbo.MyTable') IS NOT NULL DROP TABLE dbo.MyTable; CREATE TABLE dbo.MyTable ( MyTableId INT IDENTITY PRIMARY KEY CLUSTERED, ChangeDetection ROWVERSION UNIQUE NONCLUSTERED, SomeData CHAR(100) DEFAULT 'a' ); GO -------------------- -- SESSION A - Pulling rows wrong way (using @@DBTS) -------------------- DECLARE @PullPoint ROWVERSION=0, @NewPullPoint ROWVERSION; DECLARE @rows INT, @changes INT=0; WHILE 1=1 BEGIN SET @NewPullPoint = @@DBTS; -- pull changes SELECT @rows = COUNT(*) FROM dbo.MyTable t WHERE t.ChangeDetection > @PullPoint AND t.ChangeDetection <= @NewPullPoint; SET @PullPoint = @NewPullPoint IF @rows>0 BEGIN SET @changes+=@rows; RAISERROR('Pulled %d, total changes: %d', 0,0, @rows, @changes) WITH NOWAIT; END WAITFOR DELAY '00:00:00.01' END; GO -------------------- -- SESSION B - INSERT (1 000 000 rows) -------------------- -- These are the rows we want to pull later, in a pulling session! USE tempdb; SET NOCOUNT ON GO TRUNCATE TABLE dbo.MyTable; GO INSERT INTO dbo.MyTable(SomeData) SELECT TOP 1000 '1' FROM sys.all_columns c CROSS JOIN sys.all_columns c2 GO 1000 -- 1000 times we insert 1000 rows SELECT total_inserted=COUNT(*) FROM dbo.MyTable -- total 1 000 000 rows -------------------------------- -- Watch session A (pulling) Messages tab. -- We lost rows!!!! -- We pulled less rows than inserted, around 5% of rows are lost! -------------------------------- ------------------------------ -- THE RIGHT WAY ------------------------------ -------------------- -- SESSION A - pull the rows - CORRECT (using MIN_ACTIVE_ROWVERSION) -- No rows are skipped using this method! -------------------- DECLARE @PullPoint ROWVERSION=0, @NewPullPoint ROWVERSION; DECLARE @rows INT, @changes INT=0; WHILE 1=1 BEGIN SET @NewPullPoint = MIN_ACTIVE_ROWVERSION(); -- insted of @@DBTS! -- pull changes SELECT @rows = COUNT(*) FROM dbo.MyTable t WHERE t.ChangeDetection >= @PullPoint AND t.ChangeDetection < @NewPullPoint; -- Spot where is equality now! SET @PullPoint = @NewPullPoint IF @rows>0 BEGIN SET @changes+=@rows; RAISERROR('Pulled %d, total changes: %d', 0,0, @rows, @changes) WITH NOWAIT; END WAITFOR DELAY '00:00:00.01' END; GO -- This pulled EXACT number of rows, no data loss! --------------------- --- EXTRAS: -- -- Drawback: long transaction on ANY table with rowversion can pause advancing of change point, thus pause pulling. -- Solution: Avoid long transactions on rowversion tables. If you can't, then Change Tracking is a good alternative, or other methods (Change Data Capture). --------------------- -- Why @@DBTS does not work? -- In a single command we can insert or update multiple rows. -- @@DBTS is increased immediately as rows are changed/inserted, before commit. -- Pulling session pulls all rowversion numbers with no gaps - how that can skip rows??? -- Well, we insert rows in one session and that increases @@DBTS even before rows are visible, before the are committed. -- So pulling session does not see those uncommitted rows, but @@DBTS is increased for them and they will be skipped on next pull. -- MIN_ACTIVE_ROWVERSION will not be increased until commit, so rows are visible for pulling. |
Conclusion
Do not use @@DBTS to pull rows! Use MIN_ACTIVE_ROWVERSION() instead. I hope this helped someone.
Thanks Verdan
We’ve been using MIN_ACTIVE_ROWVERSION() quite successfully too. Except, today we noticed an unfortunate behavour when using Read-only replicas (we’re using SQL 2017).
Quite surprisingly the value of MIN_ACTIVE_ROWVERSION() is much LARGER on read-only replicas than on the primary database – this doesn’t make sense and obviously causes problems as you can end up missing a while lot of records.
Any ideas about getting a “true” active row version from read-only replicas?
Are both primary and secondary giving the same @@DBTS value? If you stop activity (changes) on primary, is MIN_ACTIVE_ROWVERSION() on both giving you the same value? Due to transaction activity MIN_ACTIVE_ROWVERSION() gives lower value than @@DBTS. Even on secondary replica if you use MIN_ACTIVE_ROWVERSION that should not skip any rows, despite being ahead of primary, if you pull always from the same (secondary) replica.
Well presented. I didn’t even know about @@DBTC but makes sense it wouldn’t be reliable. I just save last pulled value per table and save it and use it for next pull for starting point. Seems like use of default read committed isolation level seems like you can’t leak changes correct?
If you use MIN_ACTIVE_ROWVERSION, it will be ok in both default isolation (read committed) or when RCSI is enabled. If you use @@DBTS, it will be incorrect, skiping rows.
Hi
Perhaps you can help me with my similar question. I have a database in which a table with a field data type is timestamp, I noticed one feature that I still cannot understand. When I make a backup of this database and restore it on another server, I observe the following behavior, after inserting a new row into the table, the value of the timestamp field is less than in the last data row present
Excellent! Thanks!