Google Summer of Code 2008

Google Summer of Code is upon us again! In previous years we have had some great applications and ideas submitted by new and experienced developers. This is a list of ideas for projects submitted by OSCAR developers.

Project name: oscar-ipmi

  • Keywords: ipmi, ha, perl, c++, gui, cli, rrdtool, cacti
  • Problem Summary: Many newer cluster nodes have IPMI management features,
    which can be used to reboot nodes, collect environmental statistics, and
    perform other management tasks. Currently OSCAR makes no use of these
    cards.
  • Project Description: Add to the existing GUI and system a way to collect
    IPMI MAC addresses, add them to DHCP, and build some wrappers around
    ipmitool so that nodes can be rebooted automatically if they crash.
    Interact with the scheduler to see whether nodes are down, or use ssh, or
    possibly linux-ha, to determine node status. Graphing environmental data
    from the nodes would be beneficial, as would collecting information
    about failed or failing components.
    Project includes:

    • Adding MAC addresses for IPMI cards to the OSCAR database
    • Adding to the installation wizard a way to collect them and pair them
      with hosts
    • Possibly a test to ensure that you are talking to the correct node
    • Adding a GUI and CLI to gather information from the cluster
    • Adding a GUI and CLI to reboot nodes (interfacing with netbootmgr to
      force reinstalls)
    • Adding a GUI and CLI (cronned) to test for hardware failures on nodes
    • Documentation
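
The ipmitool wrapper idea above can be illustrated with a minimal sketch. It only builds the command lines and does not run them. The function names, the node-status dictionary, the BMC host names, and the "admin" user are hypothetical and not part of OSCAR; the ipmitool options used (-I, -H, -U, -E and "chassis power") are standard ipmitool options.

```python
def ipmitool_command(bmc_host, user, action, interface="lanplus"):
    """Build an ipmitool argument list for a chassis power action."""
    allowed = {"status", "on", "off", "cycle", "reset"}
    if action not in allowed:
        raise ValueError(f"unsupported power action: {action}")
    # -E makes ipmitool read the password from the IPMI_PASSWORD
    # environment variable, so it never appears on the command line.
    return ["ipmitool", "-I", interface, "-H", bmc_host,
            "-U", user, "-E", "chassis", "power", action]


def reboot_plan(node_alive, bmc_of, user="admin"):
    """Return one power-cycle command per node reported as down.

    node_alive: hypothetical mapping of node name -> True if the node
                answered the health check (scheduler, ssh, linux-ha, ...).
    bmc_of:     hypothetical mapping of node name -> IPMI (BMC) host name.
    """
    plan = []
    for node in sorted(node_alive):
        if not node_alive[node]:
            plan.append(ipmitool_command(bmc_of[node], user, "cycle"))
    return plan
```

Each returned list could be passed to subprocess.run() once the health check itself is trusted; keeping command construction separate from execution makes the wrapper easy to test without touching real hardware.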

Project name: Configurator extension

  • Keywords: Perl, Configurator, GUI, CLI, XML, XHTML, XSL-T
  • Bug #: 459
  • Problem Summary: Configurator currently uses HTML documents to describe the
    configuration page and options. Unfortunately, while HTML can easily be
    displayed with GUI toolkits, it is quite difficult to parse and display HTML
    documents in command-line tools. This is one of the major obstacles to
    implementing a fully featured command-line interface.

  • Project Description: Specify and implement a solution for Configurator based on
    XML or XHTML. To avoid breaking the existing code, the current GUI code will
    still be based on HTML. If needed, a solution may be to use XSLT
    documents to "translate" XML Configurator documents into HTML documents.
    Therefore, the project includes:

      • Choice between XML and XHTML. One of the main questions is whether XHTML
        can be used in the current GUI.

      • If XML is used, implementation of an XSLT document for the creation of the
        HTML files needed by the current GUI. Modification of the OSCAR Makefile for
        the automatic generation of the HTML Configurator documents.

      • "Translation" of current HTML files into XML/XHTML.
      • Extend the current CLI to support Configuration in command-line mode.
      • Integrate Configurator support into the new command-line tool (the oscar
        script).

      • Write full documentation.
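
To illustrate why an XML description helps both front ends, here is a minimal sketch that renders one XML document as an HTML form and as command-line prompts. The XML layout (a "configurator" root with "option" elements) is purely hypothetical, since the real schema is still to be specified by this project. The sketch also does the transformation directly in Python rather than with XSLT, because the Python standard library has no XSLT processor.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML layout; the real Configurator schema is not defined yet.
SAMPLE = """
<configurator package="example">
  <option name="server" type="text" default="localhost">Server host name</option>
  <option name="enable" type="checkbox" default="yes">Enable the service</option>
</configurator>
"""


def load_options(xml_text):
    """Parse the XML once into plain dictionaries shared by GUI and CLI."""
    root = ET.fromstring(xml_text)
    return [
        {
            "name": opt.get("name"),
            "type": opt.get("type", "text"),
            "default": opt.get("default", ""),
            "label": (opt.text or "").strip(),
        }
        for opt in root.findall("option")
    ]


def to_html(options):
    """Render the options as a simple HTML form for the existing GUI."""
    rows = []
    for o in options:
        safe = {k: escape(v, quote=True) for k, v in o.items()}
        rows.append(
            '<label>{label} <input type="{type}" name="{name}" '
            'value="{default}"></label>'.format(**safe)
        )
    return "<form>\n" + "\n".join(rows) + "\n</form>"


def to_cli_prompts(options):
    """Render the same options as one text prompt per option for the CLI."""
    return ["{label} [{default}]: ".format(**o) for o in options]
```

For example, to_cli_prompts(load_options(SAMPLE)) yields one prompt string per option, while to_html() produces markup for the existing GUI from the same parsed data.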

Project name: xoscar

  • Keywords: C++, Qt4, CLI
  • Problem Summary: A set of new tools is currently under development, with
    the goal of supporting virtualization and diskless solutions and of providing a
    new GUI based on modern technology. To that end, the OSCAR team started to
    develop a C++/Qt4 GUI. However, this GUI does not yet support all the
    capabilities provided by the new OSCAR tools (the new GUI is based on a new
    CLI).

  • Project Description: The GUI has to be finalized and extended in order to
    support virtualization (via V2M/OSCAR-V). A description of a few tasks is available here: http://svn.oscar.openclustergroup.org/trac/oscar/browser/pkgsrc/xoscar/TODO

Project name: V2M extension

  • Keywords: virtualization, C++, KVM, lguest, VMWare
  • Bug#:
  • Problem Summary: Release a new version of V2M that supports additional
    virtualization solutions such as lguest, KVM, and VMWare.

  • Project Description:
    • Release V2M v1.0 (see current roadmap).
    • Release V2M v1.1 (see current roadmap).
    • Add support for lguest and KVM as a priority, and optionally VMWare. Each
      time a new solution is supported, release a new version of V2M. Note that
      the exact list of supported features, as well as the way to validate
      them, still has to be decided.
  • V2M website: http://www.csm.ornl.gov/srt/v2m.html

Project name: Extension of SSI-OSCAR: implementation of a high performance file
system for clusters.

  • Keywords: C, Linux kernel, kernel modules, single system image, SSI-OSCAR, Kerrighed.
  • Problem Summary: File systems commonly used today in clusters are based on a
    compute nodes/storage nodes paradigm. Computations do not take advantage of
    CPUs on storage nodes, and the distributed file system used in the cluster does
    not use the compute nodes' disks. Even when the compute and storage nodes are
    the same machines, the file system is not integrated with the cluster services
    and cannot take advantage of them to improve performance.

  • Project Description: kDFS, a kernel Distributed File System, aims at providing
    a high performance file system for clusters. Based on a fully symmetric model,
    kDFS is currently implemented in the framework of the Kerrighed open-source
    project, a single system image used in SSI-OSCAR (http://www.kerrighed.org,
    http://ssi-oscar.gforge.inria.fr).

    The current version of kDFS provides distributed caches for both metadata and
    data stored on the native file system.

    The main idea of kDFS is to take advantage of available cluster services in
    order to improve performance and availability. As an example, process migration
    mechanisms can lead to a smaller network impact by running the computations on
    nodes where persistent data is available.

    Moreover, the coordination between such an integrated file system and the
    cluster resource management system (traditional batch scheduler or SSI load
    balancing mechanisms) could also contribute to better performance.
    As clusters grow larger, storage heterogeneity (e.g. when faulty nodes are
    replaced by nodes with bigger disks) and dynamicity become important
    parameters that system developers have to take into account.

    The goal of this GSoC project is to implement striping and replication policies
    in kDFS in order to handle these parameters. The work consists of two steps:
    first, implementing the mechanisms required to handle manual addition and
    removal of nodes in a cluster; second, taking advantage of dynamic RAID
    strategies to further improve global cluster efficiency.
    Depending on time constraints, the student could also study fault tolerance
    aspects in the context of kDFS usage.
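
As a rough illustration of what a striping and replication policy has to decide, the sketch below places stripes round-robin across nodes with a fixed number of replicas and lists the stripes affected when a node is removed. It is not kDFS code: kDFS lives in the Linux kernel and is written in C, and the function names, node names, and round-robin policy here are assumptions chosen only to make the idea concrete.

```python
def place_stripes(num_stripes, nodes, replicas=2):
    """Map each stripe index to `replicas` distinct nodes, round-robin."""
    if not 1 <= replicas <= len(nodes):
        raise ValueError("replicas must be between 1 and the number of nodes")
    placement = {}
    for stripe in range(num_stripes):
        # Consecutive offsets modulo len(nodes) are distinct because
        # replicas <= len(nodes).
        placement[stripe] = [
            nodes[(stripe + r) % len(nodes)] for r in range(replicas)
        ]
    return placement


def stripes_to_repair(placement, removed_node):
    """List the stripes that lose a replica when removed_node leaves."""
    return [s for s, holders in placement.items() if removed_node in holders]
```

A real policy would also have to weigh disk capacity and current load, which is where the storage heterogeneity mentioned above comes in.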

Seeds of Ideas!
These are some other great ideas submitted that need some additional fleshing out before they're ready for action. That fleshing out could be done by you!

  1. Disaster-tolerant or disaster-ready features for OSCAR. These may be centered around DRBD, HA, and OSCAR integration.
  2. ClearSpeed-ready or GPU-ready clusters (ClearSpeed SW stack/CUDA + MPI integration).
  3. Wrapping an OFED installation into OSCAR, so that it can easily build the OFED (OpenFabrics Enterprise Distribution, for InfiniBand) package and integrate it with SGE/Torque. (IB hardware may be available for testing by arrangement.)