
Top 100+ Apache Cassandra Interview Questions And Answers - May 26, 2020


Question 1. Explain What Is Cassandra?

Answer :

Cassandra is an open-source data storage system developed at Facebook for inbox search and designed for storing and managing large amounts of data across commodity servers. It can serve both as a real-time data store for online applications and as a read-intensive database for business intelligence systems.

Question 2. List The Benefits Of Using Cassandra?

Answer :

Unlike traditional or other databases, Apache Cassandra delivers near real-time performance, simplifying the work of developers, administrators, data analysts, and software engineers.

Instead of a master-slave architecture, Cassandra is built on a peer-to-peer architecture, so there is no single point of failure.
It also offers great flexibility, as it allows multiple nodes to be added to any Cassandra cluster in any datacenter. Further, any client can forward its request to any server.
Cassandra provides extensible scalability and can easily be scaled up and down as requirements change. With high throughput for read and write operations, this NoSQL application need not be restarted while scaling.
Cassandra is also respected for its strong data replication capability, as it allows data to be stored at multiple locations, enabling users to retrieve data from another location if one node fails. Users can set the number of replicas they want to create.
It performs well on massive datasets and is therefore a preferred NoSQL database for many organizations.
It operates on a column-oriented structure, which speeds up and simplifies the process of slicing. Data access and retrieval also become more efficient with a column-based data model.
Further, Apache Cassandra supports a schema-free/schema-optional data model, which removes the need to declare up front all the columns required by your application.
Question 3. What Is The Use Of Cassandra And Why To Use Cassandra?

Answer :

Cassandra was designed to handle big data workloads across multiple nodes without any single point of failure. The other factors responsible for using Cassandra are:

It is fault tolerant and consistent
Gigabytes to petabytes scalability
It is a column-oriented database
No single point of failure
No need for a separate caching layer
Flexible schema design
It has flexible data storage, easy data distribution, and fast writes
It offers atomicity, isolation, and durability at the row level, with tunable consistency, rather than full ACID (Atomicity, Consistency, Isolation, and Durability) transactions
Multi-data center and cloud capable
Data compression
Question 4. Explain The Concept Of Tunable Consistency In Cassandra?

Answer :

Tunable consistency is a notable feature that makes Cassandra a preferred database choice of developers, analysts, and big data architects. Consistency refers to how up to date and synchronized data rows are on all their replicas. Cassandra’s tunable consistency allows users to select the consistency level best suited to their use cases. It supports two kinds of consistency: eventual consistency and strong consistency.

The former guarantees that when no new updates are made to a given data item, all accesses eventually return the last updated value. Systems with eventual consistency are said to have achieved replica convergence.

For strong consistency, Cassandra supports the following condition:

R + W > N, where

N – Number of replicas
W – Number of nodes that need to agree for a successful write
R – Number of nodes that need to agree for a successful read
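
The condition R + W > N can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper names `quorum` and `is_strongly_consistent` are made up for this example and are not part of any Cassandra API. A quorum is taken to be a majority of the N replicas.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas (illustrative helper)."""
    return n // 2 + 1


def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


N = 3  # assumed replication factor for the example
# Quorum reads with quorum writes: 2 + 2 > 3
print(is_strongly_consistent(quorum(N), quorum(N), N))
# One-replica reads with one-replica writes: 1 + 1 is not > 3
print(is_strongly_consistent(1, 1, N))
```

With N = 3, quorum reads and quorum writes satisfy the condition, while reading and writing a single replica does not, which is the eventual-consistency case.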
Question 5. Explain What Is Composite Type In Cassandra?

Answer :

In Cassandra, a composite type allows you to define a key or a column name as a concatenation of data of different types. You can use two kinds of composite type:

Row Key
Column Name
Question 6. How Does Cassandra Write?

Answer :

Cassandra performs a write by applying two commits: first it writes to a commit log on disk, and then it commits to an in-memory structure called a memtable. Once the two commits are successful, the write is complete. Memtable contents are later written to disk in a table structure called an SSTable (sorted string table). This design gives Cassandra fast write performance.
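
The write path described above can be sketched as a toy model. This is a simplified illustration, not Cassandra's implementation: the class `ToyNode`, its `memtable_limit` parameter, and the use of Python lists and dicts in place of on-disk files are all assumptions made for the example.

```python
class ToyNode:
    """Toy model of the write path: commit log, then memtable, then SSTable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only list standing in for the on-disk log
        self.memtable = {}     # in-memory key -> value map
        self.sstables = []     # each flush adds one immutable, key-sorted tuple
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. append to the commit log first
        self.memtable[key] = value            # 2. then update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()
        return True                           # both commits succeeded

    def flush(self):
        # Write the memtable out as a sorted, immutable table and start afresh.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}


node = ToyNode(memtable_limit=2)
node.write("b", 1)
node.write("a", 2)   # memtable reaches the limit and is flushed, sorted by key
node.write("c", 3)   # lands in a fresh memtable
print(node.sstables, node.memtable)
```

The model leaves out details such as discarding commit log segments after a flush; it only shows the ordering of the steps.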

Question 7. How Cassandra Stores Data?

Answer :

All data is stored as bytes
When you specify a validator, Cassandra ensures those bytes are encoded as required
Then a comparator orders the columns based on the ordering specific to the encoding
Composites are just byte arrays with a specific encoding: for each component, Cassandra stores a two-byte length, followed by the byte-encoded component, followed by a termination byte.
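
The composite layout in the last point can be illustrated with Python's `struct` module. This is a sketch under stated assumptions: a two-byte big-endian length, the raw component bytes, and a single zero byte as the end-of-component marker. The function name `encode_composite` is invented for the example.

```python
import struct


def encode_composite(components):
    """Encode each component as: 2-byte big-endian length, raw bytes, 1 end byte."""
    out = bytearray()
    for part in components:
        out += struct.pack(">H", len(part))  # unsigned 16-bit length prefix
        out += part                          # the byte-encoded component
        out += b"\x00"                       # assumed end-of-component marker
    return bytes(out)


encoded = encode_composite([b"user", b"42"])
print(encoded.hex(" "))
```

Because every component carries its own length, components of different types can be concatenated and still be split apart unambiguously.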
Question 8. Define The Management Tools In Cassandra?

Answer :

DataStax OpsCenter: a web-based management and monitoring solution for Cassandra clusters and DataStax. It is free to download and includes an additional edition of OpsCenter.

SPM primarily monitors Cassandra metrics and various OS and JVM metrics. Besides Cassandra, SPM also monitors Hadoop, Spark, Solr, Storm, ZooKeeper, and other big data platforms. The main features of SPM include correlation of events and metrics, distributed transaction tracing, real-time graphs with zooming, anomaly detection, and heartbeat alerting.

Question 9. Mention What Are The Main Components Of Cassandra Data Model?

Answer :

The main components of the Cassandra data model are:

Cluster
Keyspace
Column
Column Family
Question 10. Define Memtable?

Answer :

Similar to a table, a memtable is an in-memory, write-back cache space holding content in key and column format. The data in a memtable is sorted by key, and each ColumnFamily has a distinct memtable that retrieves column data by key. It stores writes until it is full, and then it is flushed out.

Question 11. Explain What Is A Column Family In Cassandra?

Answer :

A column family in Cassandra refers to a collection of rows.

Question 12. What Is Sstable? How Is It Different From Other Relational Tables?

Answer :

SSTable expands to ‘Sorted String Table,’ which refers to an important data file in Cassandra to which memtables are regularly flushed. SSTables are stored on disk and exist for each Cassandra table. Being immutable, SSTables do not allow any further addition or removal of data items once written. For each SSTable, Cassandra creates three separate files: a partition index, a partition summary, and a bloom filter.

Question 13. Explain What Is A Cluster In Cassandra?

Answer :

A cluster is a container for keyspaces. A Cassandra database is distributed over several machines that operate together. The cluster is the outermost container, which arranges the nodes in a ring format and assigns data to them. These nodes hold replicas, which take over in case of a data handling failure.

Question 14. Explain The Concept Of Bloom Filter?

Answer :

Associated with an SSTable, a bloom filter is an off-heap (outside the Java heap, in native memory) data structure used to check whether there is any data available in the SSTable before performing any disk I/O operation.
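
A bloom filter can answer "definitely not present" or "possibly present" without touching the disk. The following is a minimal generic sketch, not Cassandra's implementation: the class name, the bit-array size, the number of hashes, and the use of SHA-256 as the hash function are all choices made for this example.

```python
import hashlib


class BloomFilter:
    """Minimal bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, key: str):
        # Derive num_hashes positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means the key was never added; True means "maybe".
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
for key in ("alice", "bob"):
    bf.add(key)
print(bf.might_contain("alice"))
```

A read path would consult `might_contain` first and skip the SSTable entirely when it returns False, which is where the saved disk I/O comes from.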

Question 15. List Out The Other Components Of Cassandra?

Answer :

The other components of Cassandra are:

Node
Data Center
Cluster
Commit log
Memtable
SSTable
Bloom Filter
Question 16. Explain CAP Theorem?

Answer :

With a strong requirement to scale systems when more resources are needed, the CAP theorem plays a major role in shaping the scaling strategy. It is an effective way to reason about scaling in distributed systems. The Consistency, Availability, and Partition tolerance (CAP) theorem states that in distributed systems like Cassandra, users can have only two out of these three characteristics.

One of them needs to be sacrificed. Consistency guarantees that the client gets the most recent write, Availability means a reasonable response is returned within minimum time, and Partition tolerance means the system continues its operations when network partitions occur. The two options available are AP and CP.

Question 17. Explain What Is A Keyspace In Cassandra?

Answer :

In Cassandra, a keyspace is a namespace that determines data replication on nodes. A cluster typically contains one keyspace per application.

Question 18. State The Differences Between A Node, A Cluster And Datacenter In Cassandra?

Answer :

While a node is a single machine running Cassandra, a cluster is a collection of nodes that have similar types of data grouped together. Data centers are useful components when serving customers in different geographical areas. You can group different nodes of a cluster into different data centers.

Question 19. Mention What Are The Values Stored In The Cassandra Column?

Answer :

In a Cassandra column, there are basically three values:

Column Name
Value
Time Stamp
Question 20. How To Write A Query In Cassandra?

Answer :

Using CQL (Cassandra Query Language). cqlsh is used for interacting with the database.

Question 21. Mention When You Can Use Alter Keyspace?

Answer :

ALTER KEYSPACE can be used to change properties such as the number of replicas and the durable_writes setting of a keyspace.

Question 22. What OS Does Cassandra Support?

Answer :

Windows and Linux.

Question 23. Explain What Is Cassandra-cqlsh?

Answer :

Cassandra cqlsh is a command-line shell for the query language that enables users to communicate with the database. By using Cassandra cqlsh, you can do the following things:

Define a schema
Insert data, and
Execute a query
Question 24. What Is Cassandra Data Model?

Answer :

The Cassandra data model consists of four main components:

Cluster: made up of multiple nodes and keyspaces
Keyspace: a namespace to group multiple column families, typically one per application
Column: consists of a column name, value, and timestamp
ColumnFamily: multiple columns with a row key reference.
Question 25. Mention What Does The Shell Commands “capture” And “consistency” Determines?

Answer :

There are various cqlsh shell commands in Cassandra. The CAPTURE command captures the output of a command and adds it to a file, while the CONSISTENCY command displays the current consistency level or sets a new consistency level.

Question 26. What Is CQL?

Answer :

CQL is the Cassandra Query Language, used to access and query the Apache distributed database. It consists of a CQL parser that delegates all of the implementation details to the server. The syntax of CQL is similar to SQL, but it does not alter the Cassandra data model.

Question 27. What Is Mandatory While Creating A Table In Cassandra?

Answer :

While creating a table, a primary key is mandatory; it is made up of one or more columns of the table.

Question 28. Explain The Concept Of Compaction In Cassandra?

Answer :

Compaction refers to a maintenance process in Cassandra in which SSTables are reorganized to optimize the data structures on disk. The compaction process is useful when interacting with memtables. There are two types of compaction in Cassandra:

Minor compaction: started automatically when a new SSTable is created. Here, Cassandra condenses all the similarly sized SSTables into one.
Major compaction: triggered manually using nodetool. Compacts all SSTables of a ColumnFamily into one.
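
The core idea of compaction, merging several SSTables into one while keeping only the newest version of each key, can be sketched as follows. This is a toy model under assumptions: each SSTable is represented as a dict of key to (timestamp, value), and the function name `compact` is invented for the example. Real compaction strategies also decide which SSTables to pick, which this sketch does not model.

```python
def compact(sstables):
    """Merge SSTables into one, keeping the newest (timestamp, value) per key."""
    merged = {}
    for table in sstables:
        for key, (timestamp, value) in table.items():
            # Keep the entry with the highest timestamp for each key.
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    # The result is sorted by key, like any other SSTable.
    return dict(sorted(merged.items()))


old = {"a": (1, "x"), "b": (1, "y")}
new = {"b": (2, "z"), "c": (2, "w")}
print(compact([old, new]))
```

Here key "b" exists in both inputs, and only the version with the later timestamp survives in the merged table.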
Question 29. Mention What Needs To Be Taken Care While Adding A Column?

Answer :

While adding a column, you need to take care that:

The column name does not conflict with the existing column names
The table is not defined with the compact storage option
Question 30. Does Cassandra Support ACID Transactions?

Answer :

Unlike relational databases, Cassandra does not support ACID transactions.

Question 31. Explain How Cassandra Writes Data?

Answer :

Cassandra writes data in three stages:

Commit log write
Memtable write
SSTable write
Question 32. What Is Supercolumn In Cassandra?

Answer :

A Cassandra super column is a unique element consisting of similar collections of data. Super columns are essentially key-value pairs whose values are columns. A super column is a sorted array of columns, and it follows a hierarchy when in use: keyspace > column family > super column > column data structure in JSON.

Similar to row keys, super column data entries contain no independent values but are used to collect other columns. It is interesting to note that super column keys appearing in different rows do not necessarily match, and need not ever match.

Question 33. Explain What Is Memtable In Cassandra?

Answer :

Cassandra writes the data to an in-memory structure called a memtable
It is an in-memory cache with content stored as key/column
Memtable data is sorted by key
There is a separate memtable for every ColumnFamily, and it retrieves column data by key
Question 34. Define The Consistency Levels For Write Operations In Cassandra?

Answer :

ALL: Highly consistent. A write must be written to the commit log and memtable on all replica nodes in the cluster
EACH_QUORUM: A write must be written to the commit log and memtable on a quorum of replica nodes in all data centers.
LOCAL_QUORUM: A write must be written to the commit log and memtable on a quorum of replica nodes in the same data center.
ONE: A write must be written to the commit log and memtable of at least one replica node.
TWO, THREE: Same as ONE but at least two and three replica nodes, respectively
LOCAL_ONE: A write must be written to at least one replica node in the local data center
ANY: A write must be written to at least one node; a hinted handoff counts if no replica node is available
SERIAL: Linearizable consistency to prevent unconditional updates
LOCAL_SERIAL: Same as SERIAL but restricted to the local data center
Question 35. Explain How Cassandra Writes Changed Data Into Commitlog?

Answer :

Cassandra appends changed data to the commit log

The commit log acts as a crash recovery log for data
Until the changed data is appended to the commit log, the write operation is never considered successful
Data will not be lost once the commit log is flushed out to file.

Question 36. What Is Difference Between Column And Super Column?

Answer :

Both elements work on the principle of a tuple having a name and a value. However, the former’s value is a string, while the value in the latter is a map of columns with different data types.

Unlike columns, super columns do not contain the third component, the timestamp.

Question 37. What Is ColumnFamily?

Answer :

As the name suggests, a ColumnFamily refers to a structure having an unlimited number of rows. Each row's contents are referred to through key-value pairs, where the key is the name of the column and the value represents the column data. It is much like a HashMap in Java or a dictionary in Python. Remember, the rows are not limited to a predefined list of columns here. Also, the ColumnFamily is quite flexible, with one row having 100 columns while another has only 2 columns.
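
The dictionary analogy can be made concrete with a nested Python dict. This is only a mental model of the structure, not how Cassandra stores data; the row keys, column names, and values are made up for the example.

```python
# Row key -> {column name -> column value}; rows need not share the same columns.
users = {
    "row1": {"name": "Ada", "email": "ada@example.com", "city": "London"},
    "row2": {"name": "Lin"},
}

# Adding a column to one row does not affect any other row.
users["row2"]["phone"] = "555-0100"

print(sorted(users["row1"]))
print(sorted(users["row2"]))
```

The two rows end up with different sets of columns, which is the flexibility the answer describes.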

Question 38. Explain How Cassandra Delete Data?

Answer :

SSTables are immutable, so a row cannot be removed from an SSTable. When a row needs to be deleted, Cassandra assigns the column value a special value called a tombstone. When the data is read, a value marked with a tombstone is treated as deleted.
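
A delete that never modifies existing tables can be sketched like this. It is a toy model under assumptions: SSTables are plain dicts kept in a list from oldest to newest, and `TOMBSTONE`, `delete`, and `read` are names invented for the example rather than Cassandra APIs.

```python
TOMBSTONE = object()  # sentinel standing in for the deletion marker

# Oldest table first; existing tables are never modified after creation.
sstables = [{"k1": "v1", "k2": "v2"}]


def delete(key):
    # Record the delete in a newer table instead of editing the old one.
    sstables.append({key: TOMBSTONE})


def read(key):
    # The newest table wins; a tombstone hides any older value.
    for table in reversed(sstables):
        if key in table:
            value = table[key]
            return None if value is TOMBSTONE else value
    return None


delete("k1")
print(read("k1"), read("k2"))
```

The old value for "k1" is still physically present in the first table; it only disappears for good once the tables are merged and the tombstoned entry is dropped.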

Question 39. Define The Use Of Source Command In Cassandra?

Answer :

The SOURCE command is used to execute a file that contains CQL statements.

Question 40. What Is Thrift?

Answer :

Thrift is a legacy RPC protocol or API unified with a code generation tool for CQL. The purpose of using Thrift in Cassandra is to facilitate access to the DB across programming languages.

Question 41. Explain Tombstone In Cassandra?

Answer :

A tombstone is a marker indicating that a column has been deleted. These marked columns are removed during compaction. Tombstones matter because Cassandra is eventually consistent: the deletion marker has to reach every replica so that a replica that missed the delete does not bring the data back.

Question 42. What Platforms Cassandra Runs On?

Answer :

Since Cassandra is a Java application, it can run on any platform with a Java Runtime Environment (JRE) or Java Virtual Machine (JVM). Cassandra also runs on RedHat, CentOS, Debian, and Ubuntu Linux platforms.

Question 43. Name The Ports Cassandra Uses?

Answer :

By default, Cassandra uses port 7000 for cluster management, 9160 for Thrift clients, and 8080 for JMX. These are all TCP ports and can be edited in the configuration file: bin/cassandra.in.sh

Question 44. Can You Add Or Remove Column Families In A Working Cluster?

Answer :

Yes, but keep in mind the following steps:

Do not forget to clear the commit log with ‘nodetool drain’
Turn off Cassandra to check that there is no data left in the commit log
Delete the SSTable files for the removed CFs
Question 45. What Is Replication Factor In Cassandra?

Answer :

Replication factor is the measure of the number of data copies that exist. It is important to increase the replication factor to log into the cluster.

Question 46. Can We Change Replication Factor On A Live Cluster?

Answer :

Yes, but it will require running repair to adjust the replica count of existing data.

Question 47. How To Iterate All Rows In ColumnFamily?

Answer :

Using get_range_slices. You can start iteration with the empty string, and after each iteration, the last key read serves as the start key for the next iteration.
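
The paging pattern described above can be sketched against a simulated data source. Everything here is hypothetical: `fake_range_slice` is a stand-in written for this example and does not reproduce the real get_range_slices signature, and rows are ordered by key purely for simplicity. Because the start key is inclusive, each page after the first repeats the last row already seen, so that row is skipped.

```python
import bisect

ROWS = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}  # made-up sample data
SORTED_KEYS = sorted(ROWS)


def fake_range_slice(start_key, count):
    """Hypothetical stand-in: up to `count` rows with key >= start_key."""
    i = bisect.bisect_left(SORTED_KEYS, start_key)
    return [(k, ROWS[k]) for k in SORTED_KEYS[i:i + count]]


def iterate_all(page_size=3):
    if page_size < 2:
        raise ValueError("page_size must be at least 2 to make progress")
    seen, start = [], ""  # begin with the empty string
    while True:
        page = fake_range_slice(start, page_size)
        # The start key is inclusive, so drop the row we already have.
        if seen and page and page[0][0] == seen[-1][0]:
            page = page[1:]
        if not page:
            return seen
        seen.extend(page)
        start = page[-1][0]  # last key read becomes the next start key


print([key for key, _ in iterate_all()])
```

A page size of 1 would return only the already-seen row on every call after the first, which is why the sketch rejects it.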



