An Introduction to KNIME Analytics Platform

Dr. Lochana C. Menikarachchi

What is KNIME?

  • KNIME is an open-source data analytics tool
  • Written in Java with a visual interface
  • Offers both open-source and commercial products
  • Suitable for beginners and advanced users

Gartner Magic Quadrant Leader 2014-2018

KNIME Ecosystem

KNIME Workspaces

  • KNIME workspaces are folders where preferences and workflows are saved
  • If a workspace folder does not exist, it will be automatically created when selected
  • Workspaces can be local folders or spaces in a remote location such as KNIME Hub
  • Workspaces can be switched through the KNIME workbench
  • Different workspaces can be used for different projects to keep them clean and tidy

KNIME Hubs

  • A collaborative platform that enables users to share, deploy, and manage analytical solutions
  • Available as a free community hub and as customer-managed commercial hubs
  • Community hub is ideal for open collaboration among a wide audience of users
  • Commercial hubs provide a more controlled and customizable environment tailored to organizational needs

Community Hub vs Commercial Hubs

Feature              | Community Hub                   | Commercial Hubs
License              | Open-source, free               | Paid licenses
Access               | Public                          | Customer-managed
Collaboration        | Robust collaboration features   | Real-time collaboration, version control
Workflow Sharing     | Public sharing                  | Public and private sharing
Secure Access        | Private spaces                  | User authentication, role-based permissions
Workflow Automation  | Supports automation             | Advanced scheduling, external integration
Customization        | Open-source; some customization | Limited customization; paid add-ons
Community Engagement | Engage on forum                 | Forums, support channels

KNIME Spaces

  • KNIME spaces are virtual folders within KNIME hubs
  • Spaces can be private or public areas
  • Private spaces are accessible to the owner and to anyone the owner has granted permission (this requires a paid team plan)
  • Public spaces allow anyone with access to view and use content
  • Spaces enhance project management and team collaboration

KNIME Spaces

  • KNIME hub spaces are accessible from the KNIME workbench
  • By default, one public space and one private space are available
  • The KNIME workbench allows the creation of new spaces
  • All newly created spaces are private by default
  • The KNIME workbench does not allow deleting spaces or changing their visibility to public

KNIME Spaces

  • Space visibility can be changed, and spaces deleted, through a browser login (https://hub.knime.com/user_name)

KNIME Workflows

  • A workflow is an analysis flow consisting of a sequence of nodes that represent data analysis steps
  • Nodes are the single processing units in a workflow, implementing and executing specific analysis steps graphically

KNIME Workflows

  • Workflows can include operations like reading data, cleaning data, filtering data, and writing processed data into files
  • KNIME workflows are represented as graphs (directed acyclic graphs) connecting nodes
  • Some basic workflows can be found under the Example Workflows directory
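
To illustrate the acyclic-graph idea, the following Python sketch models a workflow as a mapping from each node to the nodes it depends on and derives a valid execution order. The node names are hypothetical, and this is not KNIME's API or its internal scheduler; it only assumes Python 3.9+ for the standard graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each node maps to the set of nodes it depends on.
workflow = {
    "read_csv": set(),
    "clean_rows": {"read_csv"},
    "filter_rows": {"clean_rows"},
    "write_csv": {"filter_rows"},
}


def execution_order(graph):
    """Return one valid execution order for an acyclic dependency graph.

    graphlib raises CycleError if the graph contains a cycle, which is
    why a workflow has to be acyclic to be executable in this model.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(workflow)
print(order)  # ['read_csv', 'clean_rows', 'filter_rows', 'write_csv']
```

Because every node here has exactly one predecessor, only one order is possible; a branching workflow may have several valid orders.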

KNIME Nodes

  • Nodes represent individual tasks in an analysis
  • Each node is displayed as a colored box with input and output ports

KNIME Nodes

  • The input is the data that a node receives via the input port
  • The output is the resulting data sent from the output port
  • Each node has specific settings that can be adjusted in a configuration dialog

Connecting Nodes

  • Only ports of the same type indicated by the same color can be connected
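
The connection rule can be sketched in Python as a simple type check. The Port class and the port type labels below are hypothetical names chosen for illustration, not KNIME identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    node: str       # name of the node that owns the port
    port_type: str  # hypothetical label, e.g. "data_table" or "flow_variable"


def can_connect(output_port, input_port):
    """A connection is allowed only when both ports share the same type."""
    return output_port.port_type == input_port.port_type


reader_out = Port("reader", "data_table")
filter_in = Port("filter", "data_table")
variable_in = Port("loop", "flow_variable")

print(can_connect(reader_out, filter_in))    # True
print(can_connect(reader_out, variable_in))  # False
```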

Node Traffic Lights

  • Each node’s state is indicated by a traffic light

Node States

Node State                      | Traffic Light Color
Inactive and not yet configured | Red
Configured but not yet executed | Yellow
Executed successfully           | Green
Error                           | Red cross
Warning                         | Yellow triangle with an exclamation mark
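
The three main states can be read as a small state machine. The Python sketch below is a simplified model with hypothetical class and method names; it does not reproduce KNIME's implementation and leaves out the error and warning indicators.

```python
from enum import Enum


class NodeState(Enum):
    NOT_CONFIGURED = "red"
    CONFIGURED = "yellow"
    EXECUTED = "green"


class Node:
    """Simplified node that only tracks its traffic-light state."""

    def __init__(self, name):
        self.name = name
        self.settings = None
        self.state = NodeState.NOT_CONFIGURED

    def configure(self, settings):
        # Storing settings moves the node from red to yellow.
        self.settings = settings
        self.state = NodeState.CONFIGURED

    def execute(self):
        # A node that has not been configured cannot be executed.
        if self.state is NodeState.NOT_CONFIGURED:
            raise RuntimeError(f"{self.name} must be configured first")
        self.state = NodeState.EXECUTED


node = Node("row_filter")
print(node.state.value)  # red
node.configure({"column": "age"})
print(node.state.value)  # yellow
node.execute()
print(node.state.value)  # green
```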

KNIME Files

  • KNIME workflows can be packaged and exported as .knwf or .knar files
  • A .knwf file contains only one workflow
  • A .knar file contains a group of workflows
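
An exported file can be inspected outside KNIME. The sketch below assumes that .knwf and .knar files are ordinary ZIP archives; that assumption is not stated in these slides and should be checked against the KNIME documentation. The function name is hypothetical.

```python
import zipfile


def list_archive_entries(path):
    """Return the entry names stored in a ZIP-format archive.

    Assumption: the exported .knwf or .knar file is a ZIP archive.
    zipfile raises BadZipFile if the file is not one.
    """
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Calling list_archive_entries with the path of an exported file returns the names of the entries it contains, which shows whether the export holds one workflow or a group of them.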

KNIME Entry Page

KNIME UI (Modern)

KNIME UI (Classic)

  • The user interface can be switched by clicking Menu → Switch to classic user interface

KNIME Extensions

  • Extensions are open-source components that add functionality
  • Enable access to and processing of complex data types
  • Include advanced algorithms and custom capabilities
  • Enhance the functionality of the KNIME Analytics Platform

Installing Extensions

Workflow Annotations

  • Annotations can be added by right-clicking the workflow editor
  • Serve as documentation for complex workflows
  • Enable better understanding

Metanodes and Components

  • Metanodes and components aim to organize messy workflows
  • Improve readability by isolating logical operations
  • Reduce number of visible nodes for neater appearance
  • Encapsulate blocks of operations within metanodes/components

Metanodes

  • In a metanode, all flow variables come in from the parent workflow and all flow variables created within the metanode go out into the workflow
  • Metanodes cannot have configuration windows
  • Metanodes cannot have views

Components

  • Flow variables created inside a component stay inside it unless they are set in the Component Output node
  • Flow variables from the external workflow cannot enter a component unless they are explicitly set in the Component Input node
  • Components can have configuration windows
  • Components can have views
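
The difference in flow variable handling can be sketched in Python. The two functions below are a simplified model with hypothetical names: each returns the variables visible inside the container and the variables that leave it. They are not part of any KNIME API.

```python
def metanode_scope(parent_vars, created_inside):
    """Metanode: all parent variables enter, all created variables leave."""
    inside = dict(parent_vars)
    outgoing = dict(created_inside)
    return inside, outgoing


def component_scope(parent_vars, created_inside, allowed_in=(), allowed_out=()):
    """Component: only explicitly selected variables cross the boundary.

    allowed_in stands in for the Component Input node selection and
    allowed_out for the Component Output node selection.
    """
    inside = {k: v for k, v in parent_vars.items() if k in allowed_in}
    outgoing = {k: v for k, v in created_inside.items() if k in allowed_out}
    return inside, outgoing


parent = {"threshold": 0.5, "run_id": 7}
created = {"row_count": 120}

print(metanode_scope(parent, created))
# ({'threshold': 0.5, 'run_id': 7}, {'row_count': 120})
print(component_scope(parent, created))
# ({}, {})
print(component_scope(parent, created, allowed_in={"threshold"}, allowed_out={"row_count"}))
# ({'threshold': 0.5}, {'row_count': 120})
```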

Metanodes vs. Components

Feature                  | Metanode | Component
Configuration settings   | No       | Yes
Chart and plot view      | No       | Yes
Isolate flow variables   | No       | Yes
Clean up messy workflows | Yes      | Yes

Metanode or Component?

  • Node with configuration settings? → a component
  • Node producing a composite view? → a component
  • To get rid of a flow variable? → a component
  • To clean up a workflow? → a metanode

KNIME - R Integration

  1. Install base R and the R package Rserve
  2. Search for and install the extensions KNIME R Scripting and KNIME R Integration
  3. Set the R path in KNIME preferences
  4. Create and execute R scripts in KNIME

Setting R Path in KNIME
