Top 46 Apache Cassandra Interview Questions - Jul 25, 2022

Q1. Can We Change Replication Factor On A Live Cluster?

Yes, but you will need to run a repair afterwards so that existing data is redistributed to match the new replication factor.

Q2. Explain What Is A Keyspace In Cassandra?

In Cassandra, a keyspace is a namespace that determines how data is replicated on nodes. A cluster can hold several keyspaces, commonly one per application, and each keyspace is visible from every node.

Q3. Explain What Is Composite Type In Cassandra?

In Cassandra, a composite type lets you define a key or a column name as a concatenation of values of different types. You can use two forms of composite type:

Row Key

Column Name

Q4. Explain What Is A Cluster In Cassandra?

A cluster is a container for keyspaces. A Cassandra database is distributed over several machines that operate together. The cluster is the outermost container; it arranges the nodes in a ring and assigns data to them. Data on each node is replicated to other nodes, which take over if a node fails.
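
The ring idea can be sketched in a few lines of Python. This is a simplified illustration, not Cassandra's partitioner: the `Ring` class, the `token` helper and the MD5-based hashing are assumptions made for the example. Each node gets a position on the ring, a key is hashed to a position, and the replicas are the next nodes found walking clockwise.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Hash a string to a position on the ring (illustrative, MD5-based)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest(), "big")


class Ring:
    """Hypothetical ring: nodes sorted by token, replicas chosen clockwise."""

    def __init__(self, node_names):
        self.entries = sorted((token(name), name) for name in node_names)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key, replication_factor):
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, token(key)) % len(self.entries)
        count = min(replication_factor, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("user:42", replication_factor=3))
```

The same key always maps to the same nodes, so any node can work out where a row lives without asking a central master.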

Q5. How Cassandra Stores Data?

All data is stored as bytes

When you specify a validator, Cassandra ensures those bytes are encoded as required

Then a comparator orders the columns based on the ordering specific to the encoding

Composites are just byte arrays with a specific encoding: for each component, Cassandra stores a two-byte length, followed by the byte-encoded component, followed by an end-of-component byte.
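
A minimal sketch of that composite layout, assuming a big-endian two-byte length and a single trailing end-of-component byte set to 0. The function names are hypothetical and this is not Cassandra's own serializer.

```python
import struct


def encode_composite(components, end_of_component=0):
    """Pack each component as: 2-byte length, raw bytes, 1 end-of-component byte."""
    out = bytearray()
    for part in components:
        out += struct.pack(">H", len(part))  # unsigned 16-bit big-endian length
        out += part
        out.append(end_of_component)
    return bytes(out)


def decode_composite(blob):
    """Reverse of encode_composite: walk the blob component by component."""
    parts, pos = [], 0
    while pos < len(blob):
        (length,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        parts.append(blob[pos:pos + length])
        pos += length + 1  # skip the trailing end-of-component byte
    return parts


encoded = encode_composite([b"user42", b"2022-07-25"])
print(encoded.hex())
print(decode_composite(encoded))
```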

Q6. What Is Cassandra Data Model?

The Cassandra data model consists of four main components:

Cluster: made up of multiple nodes and keyspaces

Keyspace: a namespace that groups multiple column families, typically one keyspace per application

Column: consists of a column name, value and timestamp

ColumnFamily: multiple columns with a row key reference.

Q7. Explain The Concept Of Tunable Consistency In Cassandra?

Tunable consistency is a distinctive feature that makes Cassandra a favored database choice of developers, analysts and big data architects. Consistency refers to how up to date and synchronized the data rows are across all their replicas. Cassandra's tunable consistency lets users select the consistency level best suited to their use cases. It supports two kinds of consistency: eventual consistency and strong consistency.

The former guarantees that, when no new updates are made to a given data item, all accesses eventually return the last updated value. Systems with eventual consistency are said to have achieved replica convergence.

For strong consistency, Cassandra supports the following condition:

R + W > N, where

N – Number of replicas

W – Number of nodes that need to agree for a successful write

R – Number of nodes that need to agree for a successful read
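
The condition is easy to check by hand. Here is a small sketch with a hypothetical helper name: when R + W > N, every read set overlaps every write set in at least one replica, so a read always reaches a node that holds the latest write.

```python
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """True when read and write replica sets must overlap (R + W > N)."""
    return r + w > n


# Three replicas, with a few read/write combinations.
for w, r in [(1, 1), (2, 2), (3, 1)]:
    verdict = "strong" if is_strongly_consistent(3, w, r) else "eventual"
    print(f"N=3 W={w} R={r}: {verdict}")
```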

Q8. Name The Ports Cassandra Uses?

By default, Cassandra uses port 7000 for cluster management, 9160 for Thrift clients, and 8080 for JMX (7199 in releases from 0.8 onward). These are all TCP ports and can be edited in the configuration file: bin/cassandra.in.sh

Q9. Can You Add Or Remove Column Families In A Working Cluster?

Yes, but keep the following steps in mind:

Do not forget to clear the commitlog with 'nodetool drain'

Turn off Cassandra to check that there is no data left in the commitlog

Delete the sstable files for the removed CFs

Q10. Mention What Are The Main Components Of Cassandra Data Model?

The main components of the Cassandra data model are

Cluster

Keyspace

Column

Column Family

Q11. What Platforms Cassandra Runs On?

Since Cassandra is a Java application, it can run on any platform that provides a Java Runtime Environment (JRE) or Java Virtual Machine (JVM). Cassandra also runs on RedHat, CentOS, Debian and Ubuntu Linux platforms.

Q12. What Is ColumnFamily?

As the name suggests, a ColumnFamily is a structure that can hold an unlimited number of rows. Each row is made up of key-value pairs, where the key is the name of the column and the value is the column data. It is much like a HashMap in Java or a dictionary in Python. Remember, the rows are not restricted to a predefined list of columns. The ColumnFamily is also flexible: one row may have 100 columns while another has only 2.
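
The dictionary analogy can be made concrete. The sketch below models a column family as a Python dict of rows, each row being its own dict of column name to value. The `users` table and its column names are invented for illustration.

```python
# Row key -> {column name -> column value}; rows need not share the same columns.
users = {
    "alice": {"email": "alice@example.com", "city": "Pune", "plan": "pro"},
    "bob": {"email": "bob@example.com"},
}

# Adding a column to one row does not affect any other row.
users["bob"]["last_login"] = "2022-07-25"

for row_key, columns in users.items():
    print(row_key, sorted(columns))
```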

Q13. List Out The Other Components Of Cassandra?

The other components of Cassandra are

Node

Data Center

Cluster

Commit log

Mem-table

SSTable

Bloom Filter

Q14. Mention What Are The Values Stored In The Cassandra Column?

A Cassandra column essentially holds three values

Column Name

Value

Time Stamp

Q15. Define Memtable?

Similar to a table, a memtable is an in-memory, write-back cache that holds content in key and column format. The data in a memtable is sorted by key, and each ColumnFamily has its own memtable, from which column data is retrieved by key. It stores writes until it is full and is then flushed out.

Q16. Mention What Needs To Be Taken Care While Adding A Column?

While adding a column you need to take care that the

Column name does not conflict with the existing column names

Table is not defined with the compact storage option

Q17. What Is CQL?

CQL is the Cassandra Query Language, used to access and query the Apache distributed database. It consists of a CQL parser that delegates all the implementation details to the server. The syntax of CQL is similar to SQL, but it does not alter the Cassandra data model.

Q18. List The Benefits Of Using Cassandra?

Unlike many conventional databases, Apache Cassandra can deliver near real-time performance, simplifying the work of developers, administrators, data analysts and software engineers.

Instead of a master-slave architecture, Cassandra is built on a peer-to-peer architecture, so there is no single point of failure.

It also offers great flexibility because it allows multiple nodes to be added to any Cassandra cluster in any datacenter. Further, any client can forward its request to any server.

Cassandra scales well and can easily be scaled up and down as requirements change. With high throughput for read and write operations, this NoSQL application need not be restarted while scaling.

Cassandra is also respected for its strong data replication capability: it stores data at multiple locations, so users can retrieve data from another location if one node fails. Users can set the number of replicas they want to create.

It performs very well on massive datasets and is therefore a popular NoSQL choice for many organizations.

It operates on a column-oriented structure, which speeds up and simplifies slicing. Data access and retrieval also become more efficient with a column-based data model.

Further, Apache Cassandra supports a schema-free/schema-optional data model, which removes the need to declare up front all the columns your application requires.

Q19. State The Differences Between A Node, A Cluster And Datacenter In Cassandra?

A node is a single machine running Cassandra, while a cluster is a collection of nodes that hold similar types of data grouped together. Data centers are useful when serving customers in different geographical areas. You can group different nodes of a cluster into different data centers.

Q20. How To Write A Query In Cassandra?

Using CQL (Cassandra Query Language). cqlsh is used for interacting with the database.

Q21. What Is Replication Factor In Cassandra?

The replication factor is the number of copies of each piece of data kept in the cluster. It is important to increase the replication factor of the system_auth keyspace so that you can still log into the cluster when a node is down.

Q22. Explain What Is A Column Family In Cassandra?

A column family in Cassandra refers to a collection of rows.

Q23. What Is Mandatory While Creating A Table In Cassandra?

A primary key is mandatory when creating a table. It is made up of one or more columns of the table.

Q24. Mention When You Can Use Alter Keyspace?

ALTER KEYSPACE can be used to change properties such as the number of replicas and the durable_writes setting of a keyspace.

Q25. Define The Use Of Source Command In Cassandra?

The SOURCE command is used to execute a file that contains CQL statements.

Q26. Explain What Is Cassandra-cqlsh?

Cassandra cqlsh is a command-line shell that lets users communicate with the database using CQL. With cqlsh, you can do the following things

Define a schema

Insert data and

Execute a query

Q27. What Is SSTable? How Is It Different From Other Relational Tables?

SSTable expands to 'Sorted String Table'. It is an important data file in Cassandra to which memtables are regularly flushed. SSTables are stored on disk and exist for each Cassandra table. They are immutable: once written, no data items can be added or removed. For each SSTable, Cassandra creates three separate structures: a partition index, a partition summary and a bloom filter.

Q28. Define The Management Tools In Cassandra?

DataStax OpsCenter: a web-based management and monitoring solution for Cassandra clusters and DataStax. It is free to download and includes an additional edition of OpsCenter

SPM primarily monitors Cassandra metrics and various OS and JVM metrics. Besides Cassandra, SPM also monitors Hadoop, Spark, Solr, Storm, ZooKeeper and other Big Data platforms. The main features of SPM include correlation of events and metrics, distributed transaction tracing, real-time graphs with zooming, anomaly detection and heartbeat alerting.

Q29. Explain The Concept Of Bloom Filter?

Associated with an SSTable, a Bloom filter is an off-heap (outside the Java heap, in native memory) data structure used to check whether the SSTable may hold any data for a key before performing any disk I/O.
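
A toy Bloom filter shows why the check is cheap. This is a sketch under stated assumptions, not Cassandra's implementation: the class name, the bit-array size and the SHA-256 based hashing are chosen only for illustration. A "no" answer is always correct, so the disk read can be skipped. A "yes" answer may occasionally be a false positive.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter backed by a bytearray used as a bit set."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


sstable_filter = BloomFilter()
sstable_filter.add("row-1")
print(sstable_filter.might_contain("row-1"))
```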

Q30. Explain What Is Cassandra?

Cassandra is an open source data storage system developed at Facebook for inbox search and designed for storing and managing large amounts of data across commodity servers. It can serve both as a real-time data store for online applications and as a read-intensive database for business intelligence systems.

Q31. What Is The Use Of Cassandra And Why To Use Cassandra?

Cassandra was designed to handle big data workloads across multiple nodes without any single point of failure. The other reasons for using Cassandra are

It is fault tolerant and consistent

Gigabytes to petabytes scalability

It is a column-oriented database

No single point of failure

No need for a separate caching layer

Flexible schema design

It has flexible data storage, easy data distribution, and fast writes

It offers atomic, isolated and durable writes with tunable consistency, rather than full ACID (Atomicity, Consistency, Isolation, and Durability) transactions

Multi-data center and cloud capable

Data compression

Q32. Define The Consistency Levels For Write Operations In Cassandra?

ALL: Highly consistent. A write must be written to the commitlog and memtable on all replica nodes in the cluster

EACH_QUORUM: A write must be written to the commitlog and memtable on a quorum of replica nodes in all data centers.

LOCAL_QUORUM: A write must be written to the commitlog and memtable on a quorum of replica nodes in the same data center.

ONE: A write must be written to the commitlog and memtable of at least one replica node.

TWO, THREE: Same as ONE but at least two and three replica nodes, respectively

LOCAL_ONE: A write must be written to at least one replica node in the local data center

ANY: A write must be written to at least one node

SERIAL: Linearizable consistency to prevent unconditional updates

LOCAL_SERIAL: Same as SERIAL but restricted to the local data center
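
A quorum is a majority of replicas, computed as the replication factor divided by two (rounded down) plus one. The sketch below uses hypothetical helper names and an invented two-data-center layout to show how many acknowledgements each quorum-based level would need.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def local_quorum(rf_by_dc: dict, local_dc: str) -> int:
    """Acknowledgements needed in the coordinator's own data center."""
    return quorum(rf_by_dc[local_dc])


def each_quorum(rf_by_dc: dict) -> dict:
    """Acknowledgements needed in every data center."""
    return {dc: quorum(rf) for dc, rf in rf_by_dc.items()}


replication = {"dc1": 3, "dc2": 5}  # assumed replication factor per data center
print(local_quorum(replication, "dc1"))
print(each_quorum(replication))
```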

Q33. Does Cassandra Support ACID Transactions?

Unlike relational databases, Cassandra does not support ACID transactions.

Q34. Explain CAP Theorem?

With a strong requirement to scale systems when more resources are needed, the CAP theorem plays a major role in shaping the scaling strategy. It is a useful way to reason about scaling in distributed systems. The Consistency, Availability and Partition tolerance (CAP) theorem states that in distributed systems like Cassandra, users can have only two of those three characteristics.

One of them needs to be sacrificed. Consistency guarantees that the client gets the most recent write, Availability means a reasonable response is returned within minimum time, and Partition tolerance means the system continues operating when network partitions occur. The two options available are AP and CP.

Q35. Explain What Is Memtable In Cassandra?

Cassandra writes the data to an in-memory structure called a Memtable

It is an in-memory cache with content stored as key/column

Memtable data is sorted by key

There is a separate Memtable for every ColumnFamily, and it retrieves column data from the key

Q36. What Is Difference Between Column And Super Column?

Both elements work on the principle of a tuple having a name and a value. However, the former's value is a string, while the value in the latter is a map of columns with different data types.

Unlike Columns, Super Columns do not contain the third component, the timestamp.

Q37. How To Iterate All Rows In ColumnFamily?

Using get_range_slices. You can start the iteration with the empty string and, after each iteration, the last key read serves as the start key for the next iteration.
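
The paging pattern can be sketched without a live cluster. In the example below, `fake_get_range_slices` is a hypothetical in-memory stand-in, not the real Thrift call or its signature. It is assumed to return rows in key order with an inclusive start key, so every page after the first repeats the last key already seen and that row is skipped.

```python
ROWS = {f"key{i:02d}": {"n": i} for i in range(7)}  # invented sample data


def fake_get_range_slices(start_key, count):
    """Hypothetical stand-in: up to `count` rows with key >= start_key, in key order."""
    keys = sorted(k for k in ROWS if k >= start_key)
    return [(k, ROWS[k]) for k in keys[:count]]


def iterate_all_rows(page_size=3):
    if page_size < 2:
        raise ValueError("page_size must be at least 2 to make progress")
    start_key, first_page = "", True
    while True:
        page = fake_get_range_slices(start_key, page_size)
        # The start key is inclusive, so drop the repeated row on later pages.
        for key, row in (page if first_page else page[1:]):
            yield key, row
        if len(page) < page_size:
            return  # a short page means the range is exhausted
        start_key, first_page = page[-1][0], False


print([key for key, _ in iterate_all_rows()])
```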

Q38. What Is SuperColumn In Cassandra?

A Cassandra Super Column is a special element consisting of similar collections of data. Super columns are essentially key-value pairs whose values are columns. A super column is a sorted array of columns, and the elements follow a hierarchy: keyspace > column family > super column > column data structure in JSON.

Similar to row keys, super column entries contain no independent values but are used to group other columns. It is interesting to note that super column keys appearing in different rows do not necessarily match and may never do so.

Q39. Explain The Concept Of Compaction In Cassandra?

Compaction refers to a maintenance process in Cassandra in which the SSTables are reorganized to optimize the data structures on disk. The compaction process is useful when interacting with the memtable. There are two types of compaction in Cassandra:

Minor compaction: started automatically when a new sstable is created. Here, Cassandra merges all the similarly sized sstables into one.

Major compaction: triggered manually using nodetool. Compacts all sstables of a ColumnFamily into one.
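
The merge at the heart of compaction can be sketched as follows. This is a simplified model, not Cassandra's compaction strategy: each SSTable is assumed to be a dict mapping a key to a (value, timestamp) pair, and the `compact` function name is hypothetical. When the same key appears in several SSTables, the entry with the newest timestamp wins.

```python
def compact(sstables):
    """Merge several SSTables into one, keeping the newest value per key."""
    merged = {}
    for table in sstables:
        for key, (value, timestamp) in table.items():
            if key not in merged or timestamp > merged[key][1]:
                merged[key] = (value, timestamp)
    # The resulting SSTable is written out in key order.
    return dict(sorted(merged.items()))


older = {"a": ("1", 10), "b": ("2", 10)}
newer = {"b": ("3", 20), "c": ("4", 20)}
print(compact([older, newer]))
```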

Q40. What Is Thrift?

Thrift is a legacy RPC protocol or API unified with a code generation tool for CQL. The purpose of using Thrift in Cassandra is to facilitate access to the DB across programming languages.

Q41. Explain How Cassandra Writes Data?

Cassandra writes data in three stages

Commitlog write

Memtable write

SStable write

Q42. How Does Cassandra Write?

Cassandra performs a write by applying two commits: first it writes to a commit log on disk, and then it commits to an in-memory structure known as the memtable. Once the two commits are successful, the write is complete. Writes are later flushed to disk in the table structure as SSTables (sorted string tables). Cassandra offers fast write performance.
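
The write path described above can be modelled in memory. This is a sketch, not Cassandra's storage engine: the `Table` class, the tiny `memtable_limit`, and clearing the commit log on flush are simplifying assumptions for the example.

```python
class Table:
    """Illustrative write path: commit log append, memtable update, flush to SSTable."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []  # stands in for the append-only log on disk
        self.memtable = {}    # in-memory writes, sorted by key when flushed
        self.sstables = []    # immutable, key-sorted tuples, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # commit 1: durable append
        self.memtable[key] = value            # commit 2: in-memory update
        if len(self.memtable) >= self.memtable_limit:
            self.flush()
        return True

    def flush(self):
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replaying

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            for stored_key, value in sstable:
                if stored_key == key:
                    return value
        return None


table = Table(memtable_limit=2)
table.write("b", "1")
table.write("a", "2")  # reaches the limit and triggers a flush
print(table.sstables, table.read("a"))
```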

Q43. Explain Tombstone In Cassandra?

A tombstone is a row marker indicating a column deletion. These marked columns are deleted during compaction. Tombstones are of great importance because Cassandra supports eventual consistency: replicas that missed the delete must still learn about it before the data can safely be removed.

Q44. Explain How Cassandra Deletes Data?

SSTables are immutable, so a row cannot be removed from them. When a row needs to be deleted, Cassandra assigns the column value a special value called a tombstone. When the data is read, the tombstone value is treated as deleted.
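
A minimal sketch of delete-by-tombstone, with a hypothetical `Store` class and a sentinel object standing in for the tombstone. Deleting never touches the older SSTable. It only adds a newer marker that reads interpret as "gone".

```python
TOMBSTONE = object()  # sentinel standing in for Cassandra's deletion marker


class Store:
    """Illustrative store: SSTables are only ever appended, never edited."""

    def __init__(self):
        self.sstables = []  # oldest first

    def flush(self, writes):
        self.sstables.append(dict(writes))

    def delete(self, key):
        # A delete is just another write whose value is the tombstone.
        self.flush({key: TOMBSTONE})

    def read(self, key):
        for sstable in reversed(self.sstables):  # newest data wins
            if key in sstable:
                value = sstable[key]
                return None if value is TOMBSTONE else value
        return None


store = Store()
store.flush({"user1": "Ada"})
store.delete("user1")
print(store.read("user1"), len(store.sstables))
```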

Q45. What OS Does Cassandra Support?

Windows and Linux.

Q46. Explain How Cassandra Writes Changed Data Into Commitlog?

Cassandra appends changed data to the commitlog

The commitlog acts as a crash recovery log for data

A write operation is never considered successful until the changed data has been appended to the commitlog

Data will not be lost once the commitlog is flushed out to file.
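
Crash recovery from an append-only log can be illustrated with a plain file. This sketch assumes a JSON-lines log format and hypothetical function names, neither of which is Cassandra's on-disk commitlog format. Each write is appended and synced before being acknowledged, and after a restart the log is replayed to rebuild the in-memory state.

```python
import json
import os
import tempfile


def append_to_commit_log(path, key, value):
    """Append one record and force it to disk before acknowledging the write."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())
    return True


def replay_commit_log(path):
    """Rebuild the memtable after a crash by re-applying records in order."""
    memtable = {}
    with open(path, encoding="utf-8") as log:
        for line in log:
            record = json.loads(line)
            memtable[record["key"]] = record["value"]
    return memtable


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "commitlog.jsonl")
    append_to_commit_log(log_path, "user1", "Ada")
    append_to_commit_log(log_path, "user1", "Ada L.")
    recovered = replay_commit_log(log_path)

print(recovered)
```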



