Home CMS Planet MySQL

    PostHeaderIcon Newsfeeds

    Planet MySQL
    Planet MySQL - https://planet.mysql.com

    • Flashback Recovery in MariaDB/MySQL Servers
      In this blog, we will see how to do a flashback recovery or rolling back the data in MariaDB, MySQL and Percona. As we know the saying  “All humans make mistakes”, following that in Database environment the data modified accidentally can bring havoc to any organisations. Recover the lost data The data can be recovered from the latest full backup or incremental backup when data size is huge it could take hours to restore it. From backup of Binlogs. Data can also be recovered from delayed slaves, this case would be helpful when the mistake is found immediately, within the period of delay. We can use anyone of the above ways or other that can help to recover the lost data, but what really matters is, What is the time taken to rollback or recover the data? and How much downtime was taken to get back to the initial state ? To overcome this disaster mysqlbinlog (MariaDB 10.2) has a very useful option i.e –flashback that comes along with binary of MariaDB server 10.2 linux ,debian and ubuntu ,though it comes with MariaDB server, it works well with Oracle Mysql servers and Percona flavour of MySQL. What is Flashback? Restoring back the data to the previous snapshot in a MySQL database or in a table is called Flashback. Flashback options help us to undo the executed row changes(DML events). For instance, it can change DELETE events to INSERTs and vice versa, and also it will swap WHERE and SET parts of the UPDATE events. Prerequisites for using flashback : binlog_format = ROW binlog_row_image = FULL Flash back uses the mysqlbinlog to create the rollback statements and it needs a FULL image (Minimal is not supported). Let us simulate a few test cases where flashback comes as a boon for recovering data. For simulating the test cases I am using employees table and Mariadb version 10.2 MariaDB [employees]> select @@version; +---------------------+ | @@version           | +---------------------+ | 10.2.23-MariaDB-log | +---------------------+ 1 row in set (0.02 sec) Table structure : MariaDB [employees]> show create table employees\G *************************** 1. row ***************************        Table: employees Create Table: CREATE TABLE `employees` (   `emp_no` int(11) NOT NULL,   `birth_date` date NOT NULL,   `first_name` varchar(14) NOT NULL,   `last_name` varchar(16) NOT NULL,   `gender` enum('M','F') NOT NULL,   `hire_date` date NOT NULL,   PRIMARY KEY (`emp_no`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec) Case 1:  Rollback the Deleted data. Consider the data is deleted was from employees table where first_name =’Chirstian’ . MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian'; +----------+ | COUNT(*) | +----------+ |      226 | +----------+ 1 row in set (0.07 sec) MariaDB [employees]> delete from employees where first_name ='Chirstian'; Query OK, 226 rows affected (0.15 sec) To revert the data to the previous state ,we need to decode the binlog and fetch the right start and stop position of the delete event happened on employees table. Start position should be taken exactly after BEGIN and Stop position is before the final COMMIT. [root@vm3 vagrant]# mysqlbinlog -v --base64-output=DECODE-ROWS /var/lib/mysql/mysql-bin.000007 > mysql-bin.000007.txt BEGIN /*!*/; # at 427 # at 501 #190417 17:49:49 server id 1  end_log_pos 501 CRC32 0xc7f1c84b  Annotate_rows: #Q> delete from employees where first_name ='Chirstian' #190417 17:49:49 server id 1  end_log_pos 569 CRC32 0x6b1b5c98  Table_map: `employees`.`employees` mapped to number 29 # at 569 #190417 17:49:49 server id 1  end_log_pos 7401 CRC32 0x6795a972         Delete_rows: table id 29 flags: STMT_END_F ### DELETE FROM `employees`.`employees` ### WHERE ###   @1=10004 ###   @2='1954:05:01' ###   @3='Chirstian' ###   @4='Koblick' ###   @5=1 ###   @6='1986:12:01' # at 23733 #190417 17:49:49 server id 1  end_log_pos 23764 CRC32 0xf9ed5c3e        Xid = 455 ### DELETE FROM `employees`.`employees` ### WHERE ### @1=498513 ### @2='1964:10:01' ### @3='Chirstian' ### @4='Mahmud' ### @5=1 ### @6='1992:06:03' # at 7401 COMMIT/*!*/; # at 23764 #190417 17:49:49 server id 1  end_log_pos 23811 CRC32 0x60dfac86        Rotate to mysql-bin.000008  pos: 4 DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; Once the count is verified the from the taken positions, we can prepare rollback statements  as a sql file using flashback as below [root@vm3 vagrant]# mysqlbinlog  -v --flashback --start-position=427 --stop-position=7401 /var/lib/mysql/mysql-bin.000007  > insert.sql Below is the comparison of conversion from Delete to Insert for a single record: ### DELETE FROM `employees`.`employees` ### WHERE ### @1=498513 ### @2='1964:10:01' ### @3='Chirstian' ### @4='Mahmud' ### @5=1 ### @6='1992:06:03' ### INSERT INTO `employees`.`employees` ### SET ### @1=498513 ### @2='1964:10:01' ### @3='Chirstian' ### @4='Mahmud' ### @5=1 ### @6='1992:06:03' MariaDB [employees]> source insert.sql Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) And the count is verified after the data load. MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian'; +----------+ | COUNT(*) | +----------+ |      226 | +----------+ 1 row in set (0.06 sec) Case 2 :  Rollbacking the Updated data. The data was updated based on below conditions MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian' and gender='M'; +----------+ | COUNT(*) | +----------+ |      129 | +----------+ 1 row in set (0.14 sec) MariaDB [employees]> update employees set gender='F' where first_name ='Chirstian' and gender='M'; Query OK, 129 rows affected (0.16 sec) Rows matched: 129  Changed: 129  Warnings: 0 MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian' and gender='M'; +----------+ | COUNT(*) | +----------+ |        0 | +----------+ 1 row in set (0.07 sec) To rollback the updated data, the same steps to be followed as in case 1. [root@vm3 vagrant]# mysqlbinlog -v --flashback --start-position=427 --stop-position=8380 /var/lib/mysql/mysql-bin.000008 > update.sql MariaDB [employees]> source update.sql Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.00 sec) MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian' and gender='M'; +----------+ | COUNT(*) | +----------+ |      129 | +----------+ 1 row in set (0.06 sec) In the above two cases by using flashback option we were able to change Event Type statements from DELETE to INSERT and Update_Event statements by Swapping the SET part and WHERE part. Rollbacking data based on Time  There may be chance where DBA’s get the request from team to rollback the deleted /updated data. In those cases DBA flashback can be used with –start-datetime and –stop-datetime options. Let us consider we get an information that data was deleted at approximate datetime 2019-05-17 4:15:00 and below query was executed. MariaDB [employees]> delete from employees where emp_no between 10101 and 10210; Query OK, 110 rows affected (0.00 sec) MariaDB [employees]> select count(*) from employees where emp_no between 10101 and 10210; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec) By Analysing available binary logs , we can fetch the required binlog for the provided datetime. -rw-rw----. 1 mysql mysql 1.1G May 16 03:11 mysql-bin.000025 -rw-rw----. 1 mysql mysql 1.1G May 16 03:36 mysql-bin.000026 -rw-rw----. 1 mysql mysql 1.1G May 16 04:02 mysql-bin.000027 -rw-rw----. 1 mysql mysql 1.1G May 16 05:10 mysql-bin.000028 -rw-rw----. 1 mysql mysql 1.1G May 16 05:32 mysql-bin.000028 To recover the data based on datetime, we should use date and time local timezone. mysql-bin.000027 is the binlog which is nearest time to 4:15:00  snap from binlog BEGIN /*!*/; # at 662711655 # at 662711736 #190516  4:39:42 server id 1  end_log_pos 662711736 CRC32 0xcd7cd191    Annotate_rows: #Q> delete from employees where emp_no between 10101 and 10210 #190516  4:39:42 server id 1  end_log_pos 662711804 CRC32 0x77719f68    Table_map: `employees`.`employees` mapped to number 29 # at 662711804 #190516  4:39:42 server id 1  end_log_pos 662714885 CRC32 0xde765d28    Delete_rows: table id 29 flags: STMT_END_F ### DELETE FROM `employees`.`employees` ### WHERE ###   @1=10101 ###   @2='1952:04:15' ###   @3='Perla' ###   @4='Heyers' ###   @5=2 ###   @6='1992:12:28' ### DELETE FROM `employees`.`employees` ### WHERE ###   @1=10102 ###   @2='1959:11:04' ###   @3='Paraskevi' ###   @4='Luby' ###   @5=2 ###   @6='1994:01:26' ### DELETE FROM `employees`.`employees` ### WHERE ###   @1=10210 ###   @2='1958:01:24' ###   @3='Yuping' ###   @4='Alpin' ###   @5=1 ###   @6='1994:05:10' # at 662714885 #190517  4:39:42 server id 1  end_log_pos 662714916 CRC32 0xf9ba3c0a    Xid = 2860541 COMMIT/*!*/; # at 662714916 The from the binlog we can see actual delete was happened at 4:39:42 [root@vm3]# mysqlbinlog --flashback --start-datetime="2019-05-17 04:39:00"  /var/lib/mysql/mysql-bin.000027  -v --database=employees --table=employees  > insert.sql MariaDB [employees]> source insert.sql Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.00 sec) MariaDB [employees]> select count(*) from employees where emp_no between 10101 and 10210; +----------+ | count(*) | +----------+ |      110 | +----------+ 1 row in set (0.04 sec) Using Flashback in MySQL Community/Percona server. To check the compatibility of mysqlbinlog tool with Mysql/Percona variants , i have also tested the with MySQL version 5.7.24 . mysql> select @@version; +---------------+ | @@version     | +---------------+ | 5.7.24-27-log | +---------------+ 1 row in set (0.00 sec) mysql> select count(*) from employees where emp_no between 10001 and 10110; +----------+ | count(*) | +----------+ |      110 | +----------+ 1 row in set (0.00 sec) mysql> delete from employees where emp_no between 10001 and 10110; Query OK, 110 rows affected (0.00 sec) Flashback option comes only with Mariadb 10.2 server and above ,so i have copied the binlog which contains my lost transactions to a new server with Mariadb binaries installed. Mariadb server (binaries) can also be installed in a local system (linux and Debian and Ubuntu) or the binlogs can be copied to pre-installed Mariadb server. Snap of delete events from MySQL 5.7 binlog BEGIN /*!*/; # at 336 #190506 10:04:25 server id 11  end_log_pos 404 CRC32 0x2b6e0318         Table_map: `employees`.`employees` mapped to number 255 # at 404 #190506 10:04:25 server id 11  end_log_pos 3459 CRC32 0x64cf7a16        Delete_rows: table id 255 flags: STMT_END_F ### DELETE FROM `employees`.`employees` ### WHERE ###   @1=10001 ###   @2='1953:09:02' ###   @3='Georgi' ###   @4='Facello' ###   @5=1 ###   @6='1986:06:26' ### DELETE FROM `employees`.`employees` ### WHERE ###   @1=10002 ###   @2='1964:06:02' ###   @3='Bezalel' ###   @4='Simmel' ###   @5=2 ###   @6='1985:11:21' mysqlbinlog --flashback -v --start-position=336 --stop-position=3459 mysql-bin.000024 > insert_57.sql mysql> source insert_57.sql Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected, 1 warning (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected (0.00 sec) mysql> select count(*) from employees where emp_no between 10001 and 10110; +----------+ | count(*) | +----------+ |      110 | +----------+ 1 row in set (0.00 sec) mysql> select @@version; +---------------+ | @@version     | +---------------+ | 5.7.24-27-log | +---------------+ 1 row in set (0.00 sec) The MariaDB flashback option works fine with MySQL variants too. Flashback with MySQL GTID The flashback works fine with the conversion of delete into insert statements in GTID, but it fails to execute the when the data is loaded. Snap of binlog from GTID enabled MySQL server SET @@SESSION.GTID_NEXT= '7d60dbfb-29dd-11e9-a71a-080027dfb17a:64'/*!*/; # at 259 #190506 11:36:29 server id 11  end_log_pos 336 CRC32 0xaa55b2c9         Query   thread_id=29022 exec_time=0     error_code=0 SET TIMESTAMP=1557142589/*!*/; SET @@session.pseudo_thread_id=29022/*!*/; SET @@session.sql_mode=1436549152/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C latin1 *//*!*/; SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; BEGIN /*!*/; # at 336 #190506 11:36:29 server id 11  end_log_pos 404 CRC32 0x8e10c960         Table_map: `employees`.`employees` mapped to number 255 # at 404 #190506 11:36:29 server id 11  end_log_pos 3459 CRC32 0x02d82af3        Delete_rows: table id 255 flags: STMT_END_F ### DELETE FROM `employees`.`employees` ### WHERE ###   @1=10001 ###   @2='1953:09:02' ###   @3='Georgi' ###   @4='Facello' ###   @5=1 ###   @6='1986:06:26' mysqlbinlog --flashback -v --start-position=336 --stop-position=3459 mysql-bin.000029  > gtid.sql mysql> source gtid.sql Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected, 1 warning (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON. ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON. ERROR 1782 (HY000): @@SESSION.GTID_NEXT cannot be set to ANONYMOUS when @@GLOBAL.GTID_MODE = ON. Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) In this case, we need to disable the GTID mode and restore data,ta , which in painful for production servers. Limitations of Flashback  It Doesn’t support DDL ( DROP/TRUNCATE or other DDL’s) It Doesn’t support encrypted binlog It Doesn’t support compressed binlog Use cases of flashback When we are not aware of exact start and stop positions , use –start-datetime and –stop-datetime Also, we can transform the changes for a particular database or a table using the options –database(-d) and table (-T) For more Binlog options refer Link Key Takeaways: To reverse the mishandled operations from binary logs. No need to stop the server to carry out this operation. When the data is small to revert back, flashback process is very faster than recovering the data from Full Backup. Point in time recovery (PITR) becomes easy. There is a similar open source tool called binlog2sql developed by US team review DBA team (Shanghai) , which works on the same principle of mysqlbinlog flashback. This tool can also be used based on convienient and usecase. Photo by Jiyeon Park on Unsplash

    • Performance Tuning Tungsten Replication to MySQL
      The Question Recently, a customer asked us: Why would Tungsten Replicator be slow to apply to MySQL? The Answer Performance Tuning 101 When you run trepctl status and see:appliedLatency : 7332.394 like this on a slave, it is almost always due to the inability for the target database to keep up with the applier. This means that we often need to look first to the database layer for the solution. Here are some of the things to think about when dealing with this issue: Architecture and Environment√ Are you on bare metal?√ Using the cloud?√ Dev or Prod?√ Network speed and latency?√ Distance the data needs to travel?√ Network round trip times? Is the replicator applying to a database installed on the same server or is it applying over the network to a remote server? shell> time mysql -e "select 1" ... real 0m0.004s user 0m0.003s sys 0m0.000s Observe the value for real – if it is 15ms or more chances are you will see slow apply rates. MySQL Binary Logging√ What binary logging format are you using?mysql> select @@global.binlog_format; For non-Multi-Master deployments, use MIXED For Multi-Master topologies, use ROW MySQL Tables√ Verify that all tables are InnoDB.√ Also make sure all tables have a Primary Key.√ Do the tables have proper indexes?√ Use the slow query log to identify if any tungsten-owned queries are taking a long time√ The MySQL EXPLAIN command is very useful in understanding slow queries:https://dev.mysql.com/doc/refman/5.7/en/using-explain.htmlhttps://dev.mysql.com/doc/refman/5.7/en/explain-output.htmlhttps://dev.mysql.com/doc/refman/5.7/en/explain-extended.html MySQL Locks√ MySQL locks can prevent queries from completing in a timely manner. Check for queries that are holding locks open: mysql> show full processlist; mysql> show open tables where in_use <> 0; mysql> show engine innodb status; OS Memory√ Is the database configured to use enough memory?√ Check for lack of server memory shell> free -m shell> top Physical Disk√ Check for disk i/o contention – this is often the real issue, especially with remote diskshell> iostat -xpne 2√ Add SSD storage into your production systems√ Split filesystems up√ Implement multi-volume striping for improved i/o speed√ Make sure there are enough IOPS if using cloud instances Summary The Wrap-Up In this blog post we discussed Tungsten Replicator applier performance tuning. To learn about Continuent solutions in general, check out https://www.continuent.com/solutions The Library Please read the docs! For more information about monitoring Tungsten clusters, please visit https://docs.continuent.com. Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business! For more information, please visit https://www.continuent.com/solutions Want to learn more or run a POC? Contact us.

    • Exposing MyRocks Internals Via System Variables: Part 5, Data Reads
      In this blog post, we continue on our series of exploring MyRocks mechanics by looking at the configurable server variables and column family options. In our last post, I explained at a high level how compression and bloom filtering are applied to data files as they are initially flushed from immutable memtables and are subsequently passed through the compaction process. With that being covered, we should now have a clear understanding as to how data writing works in MyRocks and can start reviewing how data read requests are handled. The Read Process Let’s start off by talking about how read processes are handled at the file level. When a read request comes in, the first thing it needs to do is pull the data into memory – the block cache specifically – if it’s not already there. Assuming the data is not already available in the block cache, this is the point where a read from write-based sources will occur starting with memtables and then disk. For disk lookups, the process is going to start at the top-most (L0) compaction layer. Given what we’ve covered about compaction, we know that this is where we’ll find the newest version of the data record. Additional lookups will be performed on each compaction layer from top to bottom as needed until the record is found. With the exception of L0, each compaction layer will have non-overlapping data files in sequence. Each file will be able to designate the upper and lower boundaries of its key values. So the very first thing a read process is going to do when it’s reading from a compaction layer is determine which file the key resides in, given that upper and lower boundary. There are also built-in pointers in the SST files that can be used to speed up the candidate file selection process that you can read about here. Once a candidate data file has been established based on the key range, its bloom filter data (filter pages) will be pulled into memory, assuming that bloom filtering is enabled, caching for filter pages is enabled, and the filter pages in question are not already in the cache. If caching is not enabled, the bloom filter data will be read directly from the file without caching. The key value will be hashed and checked against the bloom filter to determine if the key value is in the data file. If bloom filtering shows that the key is not in the file, the data file is ignored and the process moves on to the next compaction layer. If bloom filtering shows the data might be in the file (remember, it can’t be 100% sure) or bloom filtering is disabled, it moves onto the next step. The data file is opened and the top-level index is checked to determine which data block the key can be found in and where that block can be found in the data file. This data block is then pulled into the block cache and then evaluated to determine if the value is present. If not, it will move on to the next compaction layer. Variables and CF_OPTIONS Now that we know the basics of how reads work, let’s take a closer look at the mechanics involved and the variables that control them. Rocksdb_no_block_cache When you request a read from MyRocks, the first thing that it will do is check to see if the data is already in the block cache, except if you have the block cache disabled. This can be controlled by the variable rocksdb_no_block_cache. Default: OFF I would highly recommend that this variable remain enabled as the block cache is an essential feature of MyRocks for reads. The only reason I would consider disabling this is if I had an installation of Percona Server with MyRocks installed but had no intention of using MyRocks. At the time of this writing, Percona Server does come with MyRocks binaries, but it doesn’t come with MyRocks enabled, meaning that you would need to take extra steps with ps-admin to enable MyRocks on your instance. Ultimately I would suggest that the better option here would be to abstain from installing MyRocks if you don’t intend to use it as opposed to installing it and disabling its caches. Rocksdb_block_cache_size Alright, let’s try that again. When you request a read from MyRocks, the first thing it will do is check to see if the data is already in the block cache. If it’s not already in the block cache, it will retrieve the data from memtables or data files using the process we outlined earlier in this post and then load it into the block cache so it can then be parsed as part of the effort to resolve the query in question. For those of you that are familiar with InnoDB this is going to sound a lot like the innodb buffer pool. The block cache will store the uncompressed record data, meaning that if you are using compression at the compaction layer where the data was retrieved, it will be decompressed before it’s loaded into the block cache. The block cache will also hold onto top-level index data, which specifies the range of each data block within the data file. It also holds filter blocks (bloom filter data) comprising the whole bloom filter or just part of it if you are using partitioned bloom filters. See my previous post on bloom filters for more information on partitioned bloom filters. As you can see, the block cache is critical to read processes for MyRocks and it’s important that you give it enough space to work with. The size of this cache is designated by the system variable rocksdb_block_cache_size Default: 536870912 (512 Mb) Much like the InnoDB Buffer pool, you are going to want to expand this variable so it is large enough to fix as much of your uncompressed active data set as possible in memory. I would recommend configuring this to be somewhere between 60 – 75% of your system’s memory capacity. However, you may need to be more conservative when sizing this cache in the case that you’re using compressed data, which we’ll address by explaining Rocksdb_use_direct_reads. Rocksdb_use_direct_reads When data is read from disk, you have the option of getting data directly from the disk with no OS-level caching, but in the case of MyRocks, the operating system level disk cache may be fairly important. The reason is because in MyRocks you are unable to create a second block cache specifically for compressed data despite the fact that this is an available option in RocksDB, at least as far as I was able to determine with my testing. Given that you don’t have the option within the engine itself to cache compressed data pages, you may want to allow the operating system to do that for you. For those of you who are familiar with MYISAM, you may be able to see some similarities here. With MYISAM, the only thing that’s stored in the key cache is index blocks. It’s reliant on the operating system to store data blocks within the disk cache. Similarly, you would be reliant on the operating system disk cache to house compressed MyRocks data pages. This is an important mechanic to understand when looking at the system variable rocksdb_use_direct_reads, given that when it’s enabled you will lose the ability to use the operating system disk cache during data reads. Default: OFF If you have opted to not use any form of compression for your data set (see my previous section on compression and bloom filters), then you may actually stand to see an improvement by enabling this feature as there would be no compressed data that you would need to concern yourself with. However, common use cases for MyRocks almost always include compression, so I think it’s unlikely that you would ever want to enable this variable. Rocksdb_skip_fill_cache There may be situations where you want to read data, but don’t want to have read data pulled into the block cache. The most common example of this when you create a logical dump using a tool like mysqldump or mydumper that will use data reads to facilitate the creation of the backup. During this time you could invalidate a large portion if not all of your active data set to create this backup should this data be cached on read. For those of you who are familiar with InnoDB, their solution was to use the variable innodb_old_blocks_time to state that read data could go to the buffer pool as part of the old page list, but a certain amount of time had to elapse before accessing again would allow it to be moved up to the new page list. This assured that the active dataset within the new page list in the buffer pool would not be overrun by logical backup processes. In MyRocks, things work a little differently as there is a single LRU list for the block cache. If you want to do reads without evicting anything already in the block cache, you can do so by using the rocksdb_skip_fill_cache variable. When set to ‘ON’, data will not be cached; however, with my testing, I did find that a small number of bytes were added to the block cache, but I can only assume that this is top-level index data and filter blocks. Default: OFF Given that this variable can be configured at the session level, I would be resistant to ever changing this variable at a global level. But I would consider setting the variable to ON in my.cnf in the [mysqldump] group. Rocksdb_skip_bloom_filter_on_read MyRocks also gives you the option to skip bloom filters on reads at the global and session level using the variable rocksdb_skip_bloom_filter_on_read. Default: OFF The only use case I can theorize where skipping bloom filters would be useful is if you were doing a rapid succession (serially or concurrently), perhaps as part of a batch process, where you were filtering based on the key of the record and you were 100% certain that the record existed. Remember, MyRocks is going to first determine which data file may have the record in question by checking the upper and lower boundaries of the file. Once that’s complete it will do a bloom filter check to determine if the key is not in the file in question. If you’re completely certain that the key is there, you may consider removing the bloom filter check. Rocksdb_sim_cache_size One common problem for caching of data is determining what the impact will be if you were to change the size of the cache. Assuming you were getting a lot of cache misses, you may wonder how much more memory you need to add to your system in order to increase your cache to get a better hit ratio. Perhaps you have a large hit ratio and you want to see how much you can reduce the block cache in order to make room for other caches such as the write buffer without causing read performance degradation. This is one place where MyRocks truly shines as it comes with what’s called the simulation cache. Using the variable rocksdb_sim_cache_size, you can specify what block-size cache you would like to simulate and then leverage. The insight that this provides could greatly assist you in tuning your configuration for reads, but keep in mind it comes at about a 2% memory cost of the specified size. For example, if you set the simulation cache to 1Gb, in reality, it will consume about 21Mb of space in memory. Default: 0 (disabled) This variable should be used as needed as part of ongoing tuning exercised in order to ensure MyRocks is as efficient as possible. Just keep in mind that this variable is not dynamic, so enabling and disabling the simulated cache will require a restart of MySQL. Rocksdb_collect_sst_properties Another important aspect of reading data is table statistics, which are used by the MySQL optimizer to determine which indexes to use as part of the execution path for the query. If you’re not familiar with the concepts of table statistics or cardinality, I would recommend reading this blog post by Ronald Bradford. Or you can check my blog post on statistics handling for InnoDB as I go into a bit of detail about it there. The important thing to know is how statistics are collected for the engine that you’re working with. In the case of MyRocks, the most common time that stats will be collected is when a data file is created. Due to the nature of compaction, we know that once a data file is written it will never be changed, so this is the optimal time to collect statistics. Whether or not statistics are calculated at this time is controlled by the system variable rocksdb_collect_sst_properties. Default: ON Table statistics are a very important part of the query optimization process so I would highly recommend leaving this in its default enabled state. Rocksdb_table_stats_sampling_pct It’s common for table files to not be fully read as part of the statistics gathering process and instead only a sample of the file is read. This is true for other storage engines like InnoDB that control how much data is collected by noting how many pages should be analyzed. For MyRocks, the sample size of the data file is controlled by the system variable Rocksdb_table_stats_sampling_pct. Default: 10 It’s pretty rare that I’ve ever had to adjust an analysis sample size for statistics gathering and generally would only recommend it if you start experiencing slow queries and discover via the explain plan that the optimizer is selecting a less preferred index. Rocksdb_force_compute_memtable_stats / Rocksdb_force_compute_memtable_stats_cachetime MyRocks also has the option of gathering statistics data from your memtables. As you’ll recall from my first blog post in the series, this is where data is initially written to before it’s flushed to disk. Given that this data exists completely in memory, there is no automated trigger for collecting statistics from the data within the memtable, and instead, it’s collected when a query is run. Once collected, the statistics data will be held for a period of time and during that time there will be no further collection from memtables triggered by queries being run against the instance. Whether or not statistics are collected from memtable is controlled by the system variable rocksdb_force_compute_memtable_stats. The period of time that this data is held is controlled by the system variable rocksdb_force_compute_memtable_stats_cachetime Default: Rocksdb_force_compute_memtable_stats = ON Rocksdb_force_compute_memtable_stats_cachetime = 60000000 (microseconds) (1 minute) If you’re establishing a new instance of MyRocks and don’t have a lot of data, you may want to consider leaving this enabled in order to help you establish a statistics baseline. Once you have a large data set persisted to disk you may want to consider disabling this in order to remove the overhead of dealing with the possibility of statistics gathering at query time. Rocksdb_reset_stats In the case that you’ve lost confidence in the statistics data collected by MyRocks, you can use the system variable rocksdb_reset_stats to flush out it’s statistics data without having to restart the instance. Default: OFF When I tested other variables that instructed the engine to take an action such as rocksdb_force_flush_memtable_now, I would set them to ‘ON’ then check again a moment later and found that they had set themselves back to ‘OFF” after the instruction to take action was received. With this variable, I found that this was not the case as I had to turn it ON and then manually turn it OFF. Associated Metrics Here are some of the metrics you should pay attention to when it comes to data reads. You can find the following information using system status variables. Rocksdb_rows_read: The number of rows read from MyRocks since the last MySQL restart. Keep in mind that if you read a row twice, it will count it as 2 reads. Rocksdb_bytes_read: The total amount of uncompressed bytes read from all MyRocks resources (memtable, cache, data file) since the last MySQL restart. Rocksdb_block_cache_add: The number of blocks of any type added to the block cache since the last MySQL restart. Rocksdb_block_cache_data_add: The number of data blocks added to the block cache since the last MySQL restart. Rocksdb_block_cache_bytes_read: The number of bytes read from the block cache since the last MySQL restart. Rocksdb_block_cache_bytes_write: The number of bytes written to the block cache since the last MySQL restart. Rocksdb_block_cache_data_hit: Shows the number of times that a data block was accessed from the block cache without having to go to disk during a read request since the last MySQL restart. Rocksdb_block_cache_data_miss: Shows the number of times that a data block was not found in the block cache resulting in a request from disk during a read request since the last MySQL restart Rocksdb_block_cache_hit: Shows the total number of block cache hits (data blocks, top-level index blocks, filter blocks, etc) during read requests since the last MySQL restart. Rocksdb_block_cache_miss: Shows the total number of block cache misses (data blocks, top-level index blocks, filter blocks, etc) during read requests since the last MySQL restart. Rocksdb_get_hit_l0: The number of times that read requests got data from compaction layer L0 since the last MySQL restart. Rocksdb_get_hit_l1: The number of times that read requests got data from compaction layer L1 since the last MySQL restart. Rocksdb_get_hit_l2_and_up: The number of times that read requests got data from compaction layer L2 or lower since the last MySQL restart. Rocksdb_no_file_errors: The number of times an error occurred while trying to read data from a data file since the last MySQL restart. Typically, the metrics found in SHOW ENGINE ROCKSDB STATUS will be found in the system status variables, but there are exceptions and you may want to pay attention to the following metrics seen in ROCKSDB STATUS. Rocksdb.sim.block.cache.hit Rocksdb.sim.block.cache.miss These two metrics are leveraged when you enable the simulation cache (see rocksdb_sim_cache_size as mentioned earlier in this blog post). This will let you know what the hit and miss ratio for the simulation cache is and can be used to help you resize your block cache. Additionally, you can get a lot of the read metric information broken down by column family by checking the information_schema.ROCKSDB_PERF_CONTEXT, so be sure to take some time to familiarize yourself with what’s available there. Conclusion In this post, we took a close look at how data is read in the MyRocks engine. It’s my hope that the things we’ve learned here may help you get a better understanding of how reads tie together with compaction, compression, and bloom filters as these critical mechanics come into play at many stages of both data writes and data reads. Stay tuned for my next post where we’re going to have a look at replication specific variables.

    • S.O.L.I.D the First 5 Principles of Object Oriented Design with JavaScript
      S.O.L.I.D Stands for first five object-oriented design principle by Robert C.Martin. SOLID principles can be used to design and develop extensible and easy to maintain software. By using these principles in object-oriented programming developer can create and maintain the codebase very easily. SOLID stands for: Single-responsibility principle Open-closed principle Liskov substitution principle Interface segregation principle Dependency Inversion Principle In this article, we will learn about the principle with the codebase. Single-responsibility principle A class should have only one job. This principle states and recommends that a class should have only one responsibility. If a class contains multiple responsibilities then it becomes coupled. The coupling creates a chain of dependency which is never good for software. This principle also applies to microservices design where we allocate one responsibility to one service. For example, consider this class: class SuperHero {     constructor(name: string){ }     getHeroName() { }     saveHeros(a: Hero) { }} This class violates SRP. Here is why. This class has more than responsibility. It is managing the properties of super hero’s and also handling the database. If there is any update in database management functions then it will affect the properties management functions as well hence resulting in coupling. In order to meet the SRP, we just create another class that will handle the sole responsibility of database management. class SuperHero {     constructor(name: string){ }     getHeroName() { }}class SuperHeroDB {     getHeros(a: Hero) {}     saveHeros(a: Hero) { }} This way our code becomes more cohesive and less coupled. Open-closed principle Software entities(Classes, modules, functions) should be open for extension, not modification. This simply means that the classes, functions should not be modified whenever we need to develop new features. We should extend the entities not modify it. Let’s learn this with our super hero’s class. class SuperHero {     constructor(name: string){ }     getHeroName() { }} We want to iterate through a list of super hero’s and return their weapon of choice. class SuperHero {     constructor(name: string){ }     getHeroName() { // ... }       getWeapons(herosName) {       for(let index = 0; index <= herosName.length; index++) {          if(herosName[index].name === 'thor') {             console.log('storm breaker');          }          if(herosName[index].name === 'captainamerica') {             console.log('Shield');          }       }     }} The function getWeapons() does not meet the open-closed principle because it cannot be closed against new kind of superheroes. if we add a new superhero say Iron man. So we need to change the function and add the new code like this. class SuperHero {     constructor(name: string){ }     getHeroName() { // ... }       getWeapons(herosName) {       for(let index = 0; index <= herosName.length; index++) {          if(herosName[index].name === 'thor') {             console.log('storm breaker');          }          if(herosName[index].name === 'captainamerica') {             console.log('Shield');          }          if(herosName[index].name === 'ironman') {             console.log('the suit');          }       }     }} If you observe, for every new superhero, a new logic is added to the getWeapons() function. As per the open-closed principle, the function should be open for extension, not modification. Here is how we can make the codebase meets the standard to OCP. class SuperHero {     constructor(name: string) { }     getWeapons() { }}class Thor extends SuperHero {      getWeapons() {                   return 'storm breaker';      }}class CaptainAmerica extends SuperHero {      getWeapons() {                 return 'Shield';      }}class Ironman extends SuperHero {      getWeapons() {         return 'Suit';      }}function getWeapons(a: Array<superhero>) {         for (let index = 0 ; index <= a.length; index++) {                 console.log(a[index].getWeapons())         }} getWeapons(superheros); This way we do not need to modify the code whenever a new superhero is required to add. We can just create a class and extends it with the base class. Liskov substitution principle A sub-class must be substitutable for its super-class. This principle states that every subclass/derived class should acts as a substitute to their base/parent class. For example, consider this code. class SuperHero {     constructor(name: string) { }     getWeapons() { }}class Thor extends SuperHero {      getThorWeapons() {                   return 'storm breaker';      }}class CaptainAmerica extends SuperHero {      getCaptainWeapons() {                 return 'Shield';      }}class Ironman extends SuperHero {      getIronManWeapons() {         return 'Suit';      }}function getWeapons(a: Array<superhero>) {         for (let index = 0 ; index <= a.length; index++) {                 console.log(a[index].getWeapons())         }} getWeapons(superheros); The code shown above does not adhere to LSP. To do so, we need to have a function which is in the base class as well as in derived class hence the derived class acts as a substitute to their base class. class SuperHero {     constructor(name: string) { }     getWeapons() { }}class Thor extends SuperHero {      getWeapons() {                   return 'storm breaker';      }}function getWeapons(a: Array<superhero>) {         for (let index = 0 ; index <= a.length; index++) {                 console.log(a[index].getWeapons())         }} getWeapons(superheros); When a new class is derived, it should implement the getWeapons() function, like this. class Ironman extends SuperHero {      getIronManWeapons() {         return 'Suit';      }} Interface segregation principle A Client should not be forced to depend upon interfaces and methods that they do not use. This principle states that the client using the interface should not be forced upon using the methods they don’t need. For example, consider this interface. interface Weapons {     stormBreaker();     Shield();     Hammer();     Suit();} This interface has methods that are dealing with various weapons. If any class try to use this implement, they have to use all the methods. class Thor implements Weapons {        stormBreaker() {}        Shield() {}        Hammer() {}        Suit() {}}class Ironman implements Weapons {        stormBreaker() {}        Shield() {}        Hammer() {}        Suit() {}} If we add a new method in the interface, all the other classes must declare that method or error will be thrown. To make the interface follow LSP, we need to segregate the weapons methods to different interfaces. interface ThorWeapon {     Hammer();}interface IronmanWeapon {     Suit();}interface CaptainAmericaWeapon {     Sheild();}class thor implements ThorWeapon {     Hammer { // code }} This will make sure that the client is only implementing the methods that they should use. Dependency Inversion Principle Depend on Abstractions not on concretions. That means, 1. High-level modules should not depend upon low-level modules. Both should depend upon abstractions. 2. Abstractions should not depend on details. Details should depend upon abstractions. For example, consider this code: const pool = mysql.createPool({// other details});class SuperHero {    constructor(private db: pool) {}    saveHeroes() {       this.db.save();    }} Here, class SuperHero is a high-level component whereas a pool variable is a low-level component. This violates the DIP. In order to make it adhere to the principle, we need to make the following change. interface Connection {   mysql.createPool({ // other details})}class SuperHero {    constructor(private db: Connection) {}    saveHeroes() {       this.db.save();    }} With the little change in the code above, we can see that both the high level and low-level modules depend on abstraction. Conclusion We learned the five principles every software developer must adhere to. It might be little scary at first to follow all these principles, but with steady practice and adherence, it will become a part of us and will have a huge impact on the maintenance of our codebase. If you have any questions/errors/improvements, comment down and let me know. Further Study Facebook Login Implementation Using Nodejs and ExpressTop 5 Node.js Frameworks to Increase Coding Productivity15 Best Visual Studio Code Extensions For Web Development

    • SQL Group By Example | Group By Clause In SQL Tutorial
      SQL Group By Example | Group By Clause In SQL Tutorial is today’s topic. The GROUP BY clause in SQL is used to arrange the same data into groups with the help of some functions. The GROUP BY clause group rows that have the same values into summary rows, like “find the number of customers in each city.” The Group By statement allows you to arrange the rows of a query in the groups. The groups are determined by the table columns that you specify in a GROUP BY clause. SQL Group By Example The GROUP BY statement is also used with the aggregate functions (COUNT, MAX, MIN, SUM, AVG) to group the result by one or more columns. Its main work is to summarize the data from the database. If you want to build some kind of chart then the GROUP BY clause will be helpful to fetch the summarize the data and then draw the chart based on the fetched data. The queries that contain the GROUP BY clause are called grouped queries and only return a single row for every grouped item. Key Points GROUP BY statement is used with a SELECT statement. In the query, GROUP BY statement is placed after the WHERE clause. In the query, GROUP BY statement is placed before ORDER BY clause if used any. Syntax SELECT select_columns FROM table GROUP BY column_name1, column_name2 ,...; In the above query, the GROUP BY statement produced the group for each combination of the values in the columns listed in a GROUP BY clause. Now, let’s go to the example. First, we need to create two tables. If you do not know how to create the table in SQL, then check out SQL Create Table tutorial. Now, create the first table. We have our products dummy data. See the below data.     We have the products table which has ProductID, ProductName, ProductPrice, and created_at columns. Consider the following GROUP BY example. SELECT ProductName, YEAR (created_at) OrderYear FROM Products See the below output.     We have selected the ProductName and just Year from the date. We can add further constraints like Order By. See the following query. SELECT ProductID, ProductName, YEAR (created_at) OrderYear FROM Products ORDER BY ProductID DESC See the following example.   In this example, we retrieved the ProductID, ProductName, and the ordered year of the Products as you can see clearly from the outcome, the Products with the ProductID Order by descending. Now, see the following query. SELECT ProductID, YEAR (created_at) OrderYear, ProductPrice FROM Products GROUP BY ProductPrice ORDER BY ProductPrice DESC See the below output.     In the above query, we have selected three columns, which are Group By ProductPrice. We have used the GROUP By and ORDER BY clause. Now, let’s modify the database. We will add some more data. See the new database.       Let’s use the count aggregate function to count the number of items example. See the below example. SELECT COUNT(ProductID) as TotalID, ProductName, Year(created_at) FROM Products GROUP BY ProductName ORDER BY created_at In the above query, we have selected three columns. We are selecting the no. of ids which have the same number of Products. So, we have used GROUP BY clause with ProductName column name. Thus, the count function will count the no. of rows which has the same products and gives the total and print in the output. See the below output.   In the above output, you can see that the FRONTROW SNEAKER product has appeared two times, LV INITIALS looks only one time.  In practice, the GROUP BY clause is also used with aggregate functions for generating the summary reports. The GROUP BY clause arranges rows into the groups, and the aggregate function returns a summary (count, min, max, average, sum, etc.) for each group. If you want to refer to any other column or expression that is not listed in the GROUP BY statement, you must use that column as an input of the aggregate function. Otherwise, you will get the error because there is no guarantee that the column or expression will return the single value per group. Using GROUP BY clause with the MIN and MAX functions example See the following query. SELECT ProductID, COUNT(ProductID) as Total, ProductName, MIN(ProductPrice) as min_price FROM Products GROUP BY ProductName ORDER BY ProductID In the above query, we are selecting four columns. We are selecting a group by ProductName and count the total products with its minimum price. See the below output.   Conclusively, SQL Group By Example | Group By Clause In SQL Tutorial is over. The post SQL Group By Example | Group By Clause In SQL Tutorial appeared first on AppDividend.