B Tree

A B-tree is a self-balancing search tree used for organizing and storing sorted data. Each node can hold multiple keys and have multiple children, which keeps the tree shallow and reduces the number of disk reads and writes. B-trees are commonly used in file systems and databases.

Written by Perlego with AI-assistance

9 Key excerpts on "B Tree"

  • Disk-Based Algorithms for Big Data
    • Christopher Healey (Author)
    • 2016 (Publication Date)
    • CRC Press (Publisher)
    Significant research followed in the 1970s to improve upon the initial B-tree algorithms. In the 1980s B-trees were applied to database management systems, with an emphasis on how to ensure consistent B-tree access patterns to support concurrency control and data recovery. More recently, B-trees have been applied to disk management, to support efficient I/O and file versioning. For example, ZFS is built on top of an I/O-efficient B-tree implementation.
    The original B-tree algorithm was designed to support efficient access and maintenance of an index that is too large to hold in main memory. This led to three basic goals for a B-tree.
    1. Increase the tree’s node size to minimize seeks, and maximize the amount of data transferred on each read.
    2. Reduce search times to a very few seeks, even for large collections.
    3. Support efficient local insertion, search, and deletion.
    FIGURE 6.4 An order-4 B-tree with 3 keys per node, constructed from letters of the alphabet in a random order
    FIGURE 6.5 An order-1001 B-tree with 1000 keys per node; three levels yield enough space for about 1 billion keys
    The key insight of B-trees is that the tree should be built bottom-up, and not top-down. We begin by inserting keys into a single leaf node. When this leaf node overflows, we split it into two half-full leaves and promote a single key upwards to form a new root node. Critically, since we defer the promotion until the leaf overflows, we can pick the key that does the best job of partitioning the leaf. This split–promote operation continues throughout the life of the B-tree.
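    The split–promote step can be sketched in a few lines of Python. This is a minimal illustration, not the book's code: the function name `split_leaf` and the choice of the middle key as the promoted key are assumptions made for the example.

```python
def split_leaf(keys):
    """Split an overflowing, sorted list of keys.

    Returns (left, promoted, right): two roughly half-full leaves and the
    key that moves up into the parent. Picking the middle key is an
    assumption; it is simply the key that partitions the leaf most evenly.
    """
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


# An order-4 leaf holds at most 3 keys, so a 4th key makes it overflow.
left, promoted, right = split_leaf(["A", "D", "M", "T"])
print(left, promoted, right)
```

    Because the split is deferred until the leaf is full, the promoted key is chosen with knowledge of every key in the leaf, which is what keeps the two halves balanced.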
    A B-tree is a generalization of a BST. Rather than holding one key and pointers to two subtrees at each node, we hold up to k − 1 keys and k subtree references. This is called an order-k B-tree. Using this terminology, a BST is an order-2 B-tree. Figure 6.4 shows an order-4 B-tree used to store the same collection of keys we inserted into the paged BST in Figure 6.3.
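    To get a feel for how quickly capacity grows with the order, the sketch below counts the keys in a completely full order-k B-tree. The function name `max_keys` is hypothetical; the arithmetic follows directly from "k − 1 keys and k subtree references" per node.

```python
def max_keys(order, levels):
    """Maximum number of keys in a B-tree of the given order whose
    `levels` levels are all completely full."""
    total = 0
    nodes = 1                          # the root level has one node
    for _ in range(levels):
        total += nodes * (order - 1)   # each node holds order - 1 keys
        nodes *= order                 # each node has `order` children
    return total


print(max_keys(2, 3))      # a full three-level BST
print(max_keys(4, 2))      # a full two-level order-4 B-tree
print(max_keys(1001, 3))   # a full three-level order-1001 B-tree
```

    The total always equals order ** levels − 1, so three levels of an order-1001 tree already hold on the order of a billion keys.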
    Although our examples have low order, a B-tree node will normally hold hundreds or even thousands of keys, with each node sized to fill one or more disk pages. Figure 6.5
  • Hands-On Data Structures and Algorithms with Python
    A binary search tree (BST) is a special kind of binary tree. It is one of the most important and commonly used data structures in computer science applications. A binary search tree is structurally a binary tree and keeps the data in its nodes in sorted order, which allows fast search, insertion, and deletion operations.
    A binary tree is called a binary search tree if the value at any node in the tree is greater than or equal to the values in all the nodes of its left subtree, and less than the values of all the nodes of its right subtree. For example, if K1, K2, and K3 are key values in a tree of three nodes (as shown in Figure 6.22), with K1 at the root, K2 in the left child, and K3 in the right child, then they should satisfy the following conditions:
    • K2 <= K1
    • K3 > K1
    The following figure depicts the above condition of the binary search tree:
    Figure 6.22: An example of a binary search tree
    Let’s consider another example so that we have a better understanding of binary search trees. Consider the binary search tree shown in Figure 6.23:
    Figure 6.23: Binary search tree of six nodes
    In this tree, all the values in the left subtree of a node are less than (or equal to) the value of that node, and all the values in its right subtree are greater than the value of that node.
    To see that the example tree fulfills the properties of a binary search tree, note that all the nodes in the left subtree of the root node have a value less than 5. Likewise, all the nodes in the right subtree have a value that is greater than 5. This property applies to all the nodes in the tree with no exceptions. For example, if we take the node with the value 3, we can see that the values of all its left subtree nodes are less than 3 and the values of all its right subtree nodes are greater than 3.
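    The check described above can be automated. The following is a minimal sketch that assumes the convention "left <= node < right" from the conditions K2 <= K1 and K3 > K1; the `Node` class, the function `is_bst`, and the two sample trees are hypothetical and are not reproductions of the book's figures.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_bst(node, low=None, high=None):
    """Return True if every value v in this subtree satisfies low < v <= high.

    The bounds are tightened on the way down, so a value is compared with
    all of its ancestors and not only with its direct parent.
    """
    if node is None:
        return True
    if low is not None and node.value <= low:
        return False
    if high is not None and node.value > high:
        return False
    # Left subtree: values <= node.value. Right subtree: values > node.value.
    return (is_bst(node.left, low, node.value)
            and is_bst(node.right, node.value, high))


# Hypothetical six-node trees with root 5.
valid = Node(5, Node(3, Node(1), Node(4)), Node(7, None, Node(9)))
invalid = Node(5, Node(7, Node(1), Node(4)), Node(9))  # 7 sits left of 5

print(is_bst(valid), is_bst(invalid))
```

    Passing bounds down the recursion matters: comparing each node only with its own children would miss a value that is misplaced relative to a more distant ancestor.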
    Consider another example of a binary tree, and let’s check whether it is a binary search tree. Although the following diagram, Figure 6.24, looks similar to the previous diagram, it does not qualify as a binary search tree, because node 7 is greater than the root node 5 even though it is located in the left subtree of the root node. Node 4 is in the right subtree of its parent node 7, which also violates a rule of binary search trees. Thus, the following figure, Figure 6.24
  • PHP 7 Data Structures and Algorithms
    A red-black tree is a self-balancing binary search tree with one extra property: color. Each node in the tree stores one extra bit of information, known as its color, which is either red or black. Like an AVL tree, a red-black tree is suitable for real-time applications because both its average and worst-case complexity are logarithmic. A sample red-black tree looks like this:

    B-tree

    A B-tree is a self-balancing search tree, but it is not a binary tree, so it is not the same as a self-balancing binary search tree. The key difference is that in a B-tree a node can have more than two child nodes. A B-tree is used for large sets of data and is mainly used in filesystems and databases. The complexity of search, insertion, and deletion in a B-tree is logarithmic.

    N-ary Tree

    An N-ary tree is a tree in which a node can have at most N children. This is also known as a k-way tree or M-ary tree. A binary tree is an N-ary tree where the value of N is 2.

    Understanding a binary tree

    Binary trees are often confused with binary search trees. As we have seen in the definition, a BST is a sorted binary tree. Because it is sorted, it offers better search performance than a regular binary tree. Each binary tree node can have a maximum of two child nodes, known as the left child node and right child node. Depending on the type of binary tree, a node can have zero, one, or two child nodes.
    We can also classify binary trees into different categories:
    • Full binary tree: A full binary tree is a tree in which every node has either zero or two child nodes. A full binary tree is also known as a proper tree or a plane binary tree.
    • Perfect binary tree: A perfect binary tree is a binary tree in which all internal nodes have exactly two child nodes and all leaves have the same level or depth.
    • Complete binary tree:
  • Data Structures and Algorithms Implementation Through C
    A B-tree is a self-balancing search tree in which a node can hold multiple keys and have more than two children. In a B-tree, all leaf nodes must be at the same level; this is the balancing property.
    For a B-tree of order M:
    1. Every internal node other than the root must have at least ⌈M/2⌉ children, and every node other than the root must hold at least ⌈M/2⌉ − 1 keys.
    2. Every node has at most M children and at most M − 1 keys.
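    These limits are easy to compute for any order. The helper below is a small sketch; the name `btree_bounds` is hypothetical, and it assumes that M/2 is rounded up to the next whole number, as in the worked example for degree 5.

```python
import math


def btree_bounds(m):
    """Child and key limits for a non-root node of an order-m B-tree."""
    min_children = math.ceil(m / 2)   # M/2 rounded up
    return {
        "max_children": m,
        "max_keys": m - 1,
        "min_children": min_children,
        "min_keys": min_children - 1,
    }


print(btree_bounds(5))
print(btree_bounds(4))
```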
    The maximum number of children is called the degree or order. Let's take an example to understand the B-tree.
    Example: Construct a B-tree having degree 5 and following keys:
    5, 10, 12, 13, 14, 1, 2, 3, 4.
    Solution: By the properties above, the maximum number of children is 5 and the maximum number of keys is 4. The minimum number of children is 5/2 = 2.5, rounded up to 3, and the minimum number of keys is 3 − 1 = 2. The B-tree is constructed in the following steps.
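    The construction can be reproduced with a short program. This is a sketch of one common insertion scheme (insert into a leaf, split on overflow, promote the middle key), not necessarily the exact steps of the book; the class and function names are hypothetical, and duplicate keys and deletion are not handled.

```python
import bisect


class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # stays empty for a leaf

    @property
    def is_leaf(self):
        return not self.children


class BTree:
    def __init__(self, order):
        self.order = order               # maximum number of children
        self.root = BTreeNode()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:            # the root overflowed: grow upwards
            median, right = split
            self.root = BTreeNode([median], [self.root, right])

    def _insert(self, node, key):
        i = bisect.bisect_left(node.keys, key)
        if node.is_leaf:
            node.keys.insert(i, key)
        else:
            split = self._insert(node.children[i], key)
            if split is not None:        # the child split: absorb its median
                median, right = split
                node.keys.insert(i, median)
                node.children.insert(i + 1, right)
        if len(node.keys) > self.order - 1:
            return self._split(node)     # overflow: split and promote
        return None

    def _split(self, node):
        mid = len(node.keys) // 2
        median = node.keys[mid]
        right = BTreeNode(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return median, right

    def contains(self, key):
        node = self.root
        while True:
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                return True
            if node.is_leaf:
                return False
            node = node.children[i]


def levels(tree):
    """Return the keys of every node, grouped level by level."""
    out, frontier = [], [tree.root]
    while frontier:
        out.append([n.keys for n in frontier])
        frontier = [c for n in frontier for c in n.children]
    return out


tree = BTree(order=5)
for key in [5, 10, 12, 13, 14, 1, 2, 3, 4]:
    tree.insert(key)
for level in levels(tree):
    print(level)
```

    With this scheme the first split happens when 14 arrives (12 is promoted into a new root), and the second when 3 arrives (3 is promoted into the root). Every leaf ends up with between 2 and 4 keys, as the order-5 limits require.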

    10.18 B+ Tree

    A B+ tree is an extension of the B-tree. It is similar to a B-tree, with a few differences that give it the following properties and advantages.
    1. The B+ tree consists of two types of nodes: (1) internal nodes and (2) leaf nodes.
    2. Internal nodes point to other nodes in the tree.
    3. Leaf nodes point to data in the database by using data pointers. The data is stored only in the leaf nodes; all other nodes store only index keys.
    4. Leaf nodes are linked to each other by sibling pointers in sequential order to form a linked list.
    5. Only the leaf nodes need to be traversed to scan the entire tree, because data is present only in the leaf nodes. The higher nodes are not visited at all, which greatly reduces the number of block accesses.
    6. Traversal is faster than in B-trees, where data is present in all the nodes and a scan therefore requires more block accesses.
    7. Just like B-trees, B+ trees are balanced trees (every path from the root node to a leaf node has the same length), and every node except the root must be at least half full. The root may contain a minimum of two entries.
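    The benefit of the linked leaves in points 4 and 5 can be illustrated with a range scan. The sketch below models only the leaf level; the names `Leaf`, `link`, and `range_scan` are hypothetical, and for simplicity the scan starts at the leftmost leaf, whereas a full B+ tree would first descend through the internal nodes to find the starting leaf.

```python
class Leaf:
    def __init__(self, keys, values):
        self.keys = keys          # sorted keys stored in this leaf
        self.values = values      # data (or data pointers) for each key
        self.next = None          # sibling pointer to the next leaf


def link(leaves):
    """Chain the leaves left to right and return the leftmost one."""
    for current, following in zip(leaves, leaves[1:]):
        current.next = following
    return leaves[0]


def range_scan(first_leaf, lo, hi):
    """Yield (key, value) pairs with lo <= key <= hi using only leaf nodes."""
    leaf = first_leaf
    while leaf is not None:
        for key, value in zip(leaf.keys, leaf.values):
            if key > hi:
                return            # keys are sorted, so nothing further matches
            if key >= lo:
                yield key, value
        leaf = leaf.next


head = link([
    Leaf([1, 2], ["a", "b"]),
    Leaf([4, 5, 10], ["c", "d", "e"]),
    Leaf([13, 14], ["f", "g"]),
])
print(list(range_scan(head, 4, 13)))
```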
  • Graph Theory and Its Applications
    • Jonathan L. Gross, Jay Yellen, Mark Anderson (Authors)
    • 2018 (Publication Date)
    key, whose value determines that entry’s position in the database.
    Thus, an entry is located in a random-access table by searching for its key. The most generally useful implementation of a random-access table uses the following information structure.
    DEFINITION : A binary-search tree (BST) is a binary tree, each of whose vertices is assigned a key, such that the key assigned to any vertex v is greater than the key at each vertex in the left subtree of v, and is less than the key at each vertex in the right subtree of v.
    Example 3.4.1: Both of the binary-search trees in Figure 3.4.1 store the keys:
    3, 8, 9, 12, 14, 21, 22, 23, 28, 35, 40, 46
    Figure 3.4.1 Two binary-search trees storing the same set of keys.
    Notice that the smallest key in either BST can be found by starting at the root and proceeding to the left until you reach a vertex with no left-child. Similarly, the largest key can be found by proceeding from the root iteratively to the right as long as possible. A straightforward inductive proof can be used to show that these two properties hold for an arbitrary binary-search tree (see Exercises).
    Iterative and recursive versions of the binary-search algorithm are shown below. The algorithm is based on the following simple strategy.
    Strategy: In each iteration, exclude either the left or right subtree from the rest of the search, depending on whether the target key is less than or greater than the key at the current vertex.
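    As an illustration of this strategy, here is a Python sketch with iterative and recursive searches plus the smallest-key and largest-key walks described earlier. It is not the book's pseudocode: the `Node` class, the function names, and the insertion order used to build the sample tree are assumptions; only the key set comes from Example 3.4.1.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST rooted at root; return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def search_iterative(root, target):
    node = root
    while node is not None and node.key != target:
        # Exclude one subtree from the rest of the search.
        node = node.left if target < node.key else node.right
    return node


def search_recursive(node, target):
    if node is None or node.key == target:
        return node
    if target < node.key:
        return search_recursive(node.left, target)
    return search_recursive(node.right, target)


def find_min(root):
    node = root
    while node.left is not None:    # keep going left
        node = node.left
    return node.key


def find_max(root):
    node = root
    while node.right is not None:   # keep going right
        node = node.right
    return node.key


root = None
for key in [21, 9, 35, 3, 12, 28, 46, 8, 14, 22, 40, 23]:
    root = insert(root, key)

print(find_min(root), find_max(root))
print(search_iterative(root, 14) is not None)
print(search_recursive(root, 15) is not None)
```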
    Remark: An apology is due to the computer scientists because the versions below blur the distinction between a variable and a pointer to that variable. For instance, the variable T is typically a pointer to the root of the binary tree that is called T. Here, instead, root(T) is used to refer to the root of the binary tree T
  • Hands-On Data Structures and Algorithms with Python
    Write complex and powerful code using the latest features of Python 3.7, 2nd Edition

    • Dr. Basant Agarwal, Benjamin Baka (Authors)
    • 2018 (Publication Date)
    • Packt Publishing (Publisher)
    9 }, as shown in the following diagram, then the tree would not be more efficient than the list:
    Thus, choosing a self-balancing tree helps to improve the search operation. Here, we should note that the binary search tree is a better choice than a list in most cases; however, we should try to keep the tree balanced.

    Balancing trees

    We have seen in the previous section that if nodes are inserted into a tree in sequential order, the tree becomes slow and behaves more or less like a list; that is, each node has exactly one child node. To improve the performance of the tree data structure, we generally like to reduce the height of the tree as much as possible by filling up each row in the tree. This process is called balancing the tree.
    There are different types of self-balancing trees, such as red-black trees, AA trees, and scapegoat trees. These balance the tree during each operation that modifies the tree, such as insert or delete. There are also external algorithms that balance a tree. The benefits of these are that you don't need to balance the tree on every single operation and can leave balancing to the point where you need it.
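    The effect of balancing on height can be seen in a small experiment. The sketch below inserts 15 keys in sequential order into a plain BST and then rebuilds the same keys into a balanced tree in one pass, in the spirit of the external balancing algorithms mentioned above. All names are hypothetical, and the rebuild shown is one simple approach, not a specific algorithm from the book.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain (unbalanced) BST insertion; returns the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def build_balanced(sorted_keys):
    """Rebuild a balanced BST by always choosing the middle key as root."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    node = Node(sorted_keys[mid])
    node.left = build_balanced(sorted_keys[:mid])
    node.right = build_balanced(sorted_keys[mid + 1:])
    return node


keys = list(range(1, 16))
chain = None
for key in keys:                  # sequential order: degenerates into a list
    chain = insert(chain, key)
balanced = build_balanced(keys)

print(height(chain), height(balanced))  # 15 4
```

    With 15 keys the degenerate tree needs up to 15 comparisons per search, while the balanced one needs at most 4.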

    Expression trees

    An arithmetic expression is represented by a combination of operators and operands where the operators can be unary or binary. An arithmetic expression can also be represented using a binary tree
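    A minimal sketch of such an expression tree follows, restricted to binary operators for brevity. The `Node` class, the `OPS` table, and the sample expression (3 + 4) * 5 are assumptions made for illustration.

```python
import operator

# Map operator symbols to the functions that implement them.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value    # an operator symbol or a numeric operand
        self.left = left
        self.right = right


def evaluate(node):
    """Evaluate an expression tree: leaves are operands, inner nodes operators."""
    if node.left is None and node.right is None:
        return node.value
    return OPS[node.value](evaluate(node.left), evaluate(node.right))


# The expression (3 + 4) * 5 as a binary tree.
tree = Node("*", Node("+", Node(3), Node(4)), Node(5))
print(evaluate(tree))  # 35
```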
  • Advanced Data Structures
    Theory and Applications

    T.
    Definition 8.4.1 Let a and b be integers with a ≥ 2 and 2a − 1 ≤ b, and let p(v) denote the number of children of a node v. A tree T is an (a, b)-tree if
    1. All leaves of T have the same depth.
    2. All nodes v of T satisfy p(v) ≤ b.
    3. All nodes v except the root satisfy p(v) ≥ a.
    4. The root r can have one key.
    The B-tree described in the section above is a special case of an (a, b)-tree with b = 2a − 1, i.e., a B-tree is an (a, 2a − 1)-tree.
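    The conditions of the definition can be checked mechanically. The sketch below makes several assumptions that are not in the text: a tree is written as nested Python lists (a list is an internal node, anything else is a leaf), p(v) is taken to be the number of children of v, and condition 4 is read as "a root that is not a leaf needs at least two children". The name `is_ab_tree` is hypothetical.

```python
def is_ab_tree(tree, a, b):
    """Check the (a, b)-tree conditions on a tree given as nested lists."""
    if a < 2 or 2 * a - 1 > b:
        raise ValueError("need a >= 2 and 2a - 1 <= b")
    leaf_depths = set()

    def visit(node, depth, is_root):
        if not isinstance(node, list):     # a leaf
            leaf_depths.add(depth)
            return True
        p = len(node)                      # number of children of this node
        lower = 2 if is_root else a        # assumption for the root, see above
        if not (lower <= p <= b):
            return False
        return all(visit(child, depth + 1, False) for child in node)

    # Condition 1: all leaves must share a single depth.
    return visit(tree, 0, True) and len(leaf_depths) == 1


ok = [["x", "y"], ["z", "w", "u"]]         # a valid (2, 3)-tree
too_few = [["x"], ["y", "z"]]              # one node has only 1 child
uneven = [["x", "y"], "z"]                 # leaves at different depths

print(is_ab_tree(ok, 2, 3), is_ab_tree(too_few, 2, 3), is_ab_tree(uneven, 2, 3))
```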
    The height of an (a, b)-tree is logarithmic in the number of leaves. Insertion into and deletion from (a, b)-trees are quite similar to the corresponding operations in B-trees. An insertion means the addition of a new leaf at a given position in the tree; a deletion means the pruning of an existing leaf at a given position in the tree. Note that we treat the searches for these positions separately in what follows, i.e., for the moment we concentrate on the re-balancing aspect of (a, b)-trees.
    8.4.1   Insertion
    An insertion is accomplished by a sequence of node expansions and node splittings, terminating in a balanced (a, b)-tree. Let w be any leaf of T and suppose that a new leaf is to be inserted to the right (left) of w. Let v be the parent of w. Expand v, i.e., make the new leaf an additional child of v. The expansion of v increases p(v) by 1. If p(v) is still ≤ b, then re-balancing is complete. Otherwise v needs to be split. Since splitting may propagate, we formulate it as a loop.
    8.4.2   Deletion
    A deletion is accomplished by a sequence of node shrinkings and node fusings, possibly followed by one node sharing. Deletion has two parameters: the sharing threshold t, which specifies when to share or fuse, and the shifting parameter s, which specifies the number of children to shift when sharing. Let w be any leaf of T (the leaf to be deleted) and let v be the parent of w. In the first step we shrink v by pruning w. This decreases the value of p(v) by 1. If p(v) is still ≥ a or the height of v is 1, then re-balancing is complete, because we represent the empty tree by a single node. Otherwise, v needs to be rebalanced by either fusing or sharing. Let u be any sibling of v. When p(v) = a − 1 and p(u) = a + j during deletion, the algorithm performs a node fusing if j < t; otherwise a sharing takes place based on the value of s
  • Data Structures Through C++
    Experience Data Structures C++ through animations

    Chapter 07

    Trees

    Of Herbs, Shrubs and Bushes

    Why This Chapter Matters?

    Nature is man's best teacher. In every walk of life man has explored nature, learnt his lessons and then applied the knowledge that nature offered him to solve the everyday problems that he faced at his workplace. It isn't without reason that there are data structures like Trees, Binary Trees, Search Trees, AVL Trees, Forests, etc. Trees are non-linear data structures. They have many applications in Computer Science, hence you must understand them comprehensively.

    If large input data is stored in a linked list, then the time required to access the data is prohibitive. In such cases a data structure called a Tree is used. This data structure is often used in constructing file systems and in the evaluation of arithmetic expressions. When the tree is kept balanced, it gives a running time of O(log n) for most operations.
    Like linked lists, a tree also consists of several nodes. Each node may contain links that point to other nodes in the tree. So a tree can be used to represent a person and all of his descendants as shown in Figure 7-1.
    Figure 7-1. A tree structure.
    Note that each node in this tree contains a name for data and one or more pointers to the other tree nodes. Although a tree node may contain any number of pointers to the other tree nodes, in a large number of trees each node has at most two such pointers. Such trees are called Binary trees.

    Binary Trees

    Let us begin our study of binary trees by discussing some basic concepts and terminology.
    A binary tree is a finite set of elements that is either empty or is partitioned into three disjoint sub-sets. The first sub-set contains a single element called the root of the tree. The other two sub-sets are themselves binary trees, called the left and right sub-trees of the original tree. A left or right sub-tree can be empty.
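    This recursive definition maps directly onto code. In the sketch below, an empty tree is represented by `None` and a non-empty tree by a root node with left and right sub-trees. The `Node` class, the helper names, and the shape of the nine-node sample tree are assumptions for illustration; the sample is not a reproduction of the book's figure.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left      # root of the left sub-tree, or None if empty
        self.right = right    # root of the right sub-tree, or None if empty


def count_nodes(root):
    """An empty tree has 0 nodes; otherwise 1 for the root plus both sub-trees."""
    if root is None:
        return 0
    return 1 + count_nodes(root.left) + count_nodes(root.right)


def leaf_labels(root):
    """Labels of all nodes whose left and right sub-trees are both empty."""
    if root is None:
        return []
    if root.left is None and root.right is None:
        return [root.data]
    return leaf_labels(root.left) + leaf_labels(root.right)


# A hypothetical nine-node binary tree with root A.
tree = Node("A",
            Node("B", Node("D"), Node("E", Node("G"), None)),
            Node("C", None, Node("F", Node("H"), Node("I"))))

print(count_nodes(tree))
print(leaf_labels(tree))
```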
    Each element of a binary tree is called a node of the tree. The tree shown in Figure 7-2(a) consists of nine nodes with A as its root. Its left sub-tree is rooted at B and its right sub-tree is rooted at C. This is indicated by the two branches emanating from A to B on the left and to C on the right. The absence of a branch indicates an empty sub-tree. For example, the left sub-tree of the binary tree rooted at C and the right sub-tree of the binary tree rooted at E are both empty. The binary trees rooted at D, G, H and I
  • Data Structures Through C
    Learn the fundamentals of Data Structures through C

Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.