pyrplib package



pyrplib.artificial module

pyrplib.artificial.addmossimple(D, start_index, end_index)

For a binary matrix D, create simple multiple optimal solutions in the range of teams specified. Indices are inclusive.

pyrplib.artificial.addnoise(D, percentnoise, low=0, high=1)


Replaces randomly chosen off-diagonal elements of D with values between low and high.
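As a rough illustration of the behaviour described above, the following sketch perturbs off-diagonal entries of a square matrix. It is not the pyrplib implementation: the helper name `add_noise_sketch`, the `seed` argument, the use of nested lists instead of a DataFrame, the integer noise values, and the reading of `percentnoise` as a percentage of the n^2 entries are all assumptions made for the example.

```python
import random

def add_noise_sketch(D, percentnoise, low=0, high=1, seed=None):
    """Return a copy of D with random off-diagonal entries replaced.

    Hypothetical helper, not part of pyrplib. Assumes percentnoise is a
    percentage of the n*n entries and that noise values are integers
    drawn uniformly from low..high (inclusive).
    """
    rng = random.Random(seed)
    n = len(D)
    noisy = [row[:] for row in D]  # copy so the input is left untouched
    off_diagonal = [(i, j) for i in range(n) for j in range(n) if i != j]
    count = min(len(off_diagonal), round(n * n * percentnoise / 100))
    for i, j in rng.sample(off_diagonal, count):
        noisy[i][j] = rng.randint(low, high)
    return noisy

D = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
noisy = add_noise_sketch(D, percentnoise=30, low=0, high=1, seed=0)
```

A replaced entry may happen to receive its old value, so the number of visibly changed entries can be smaller than the number selected.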

pyrplib.artificial.create_dataset(create_func, options)

Create a dataset using a create function and a function to generate the options used.

See example create_func and get_create_options_func

pyrplib.artificial.create_dataset_manual(D_matrices, options, create_code='manual')

Create a dataset by manually passing the D matrices as a list. The options are not used in any way. They are here if you want to include them.


Create a simple cycle D matrix of size n x n.

pyrplib.artificial.domfromranking(n, r, ngames, upset_func=<function <lambda>>)

DOM matrix from ranking

Simulates the win/loss outcome of individual games using the ranking vector (r) and the upset function. The upset function must take two rankings, r1 and r2, with r1 > r2, and must return True or False depending on whether an upset occurred.
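The contract for the upset function can be sketched as follows. This is an illustrative stand-in, not pyrplib code: the names `never_upset` and `simulate_dominance_sketch`, the `seed` argument, the choice that the item with the larger entry in r is the favourite, and the convention that D[i][j] counts wins of i over j are assumptions.

```python
import random

def never_upset(r1, r2):
    """Upset function that never reports an upset (called with r1 > r2)."""
    return False

def simulate_dominance_sketch(r, ngames, upset_func, seed=None):
    """Hypothetical stand-in for the simulation described above.

    Assumes the item with the larger entry in r is the favourite and
    that D[i][j] counts the games in which item i beat item j.
    """
    rng = random.Random(seed)
    n = len(r)
    D = [[0] * n for _ in range(n)]
    for _ in range(ngames):
        i, j = rng.sample(range(n), 2)  # two distinct items play a game
        favourite, underdog = (i, j) if r[i] > r[j] else (j, i)
        if upset_func(r[favourite], r[underdog]):
            D[underdog][favourite] += 1  # upset: the underdog wins
        else:
            D[favourite][underdog] += 1
    return D

r = [3, 2, 1]
D = simulate_dominance_sketch(r, ngames=50, upset_func=never_upset, seed=1)
```

A probabilistic rule such as `lambda r1, r2: random.random() < 0.1` fits the same two-argument contract.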

pyrplib.artificial.domplusnoise(n, percentnoise, low=0, high=1)

Creates a dominance graph and adds noise.

Input: n = number of rows/cols in D matrix
percentnoise = integer between 1 and n^2 representing the percentage of noise to add to D domgraph, e.g., if percentnoise = 10, then 10% of the n^2 elements will be noise

Example: ‘D = domplusnoise(6,20)’ creates a 6 by 6 matrix with 20% noise added to the dominance graph

pyrplib.artificial.emptyplusnoise(n, percentnoise, low=0, high=3)


Function starts with an empty graph and adds some amount of noise.

Input: n = number of rows/cols in D matrix
percentnoise = integer between 1 and n^2 representing the percentage of noise to add to D hillside, e.g., if percentnoise = 10, then 10% of the n^2 elements will be noise

Example: ‘D = emptyplusnoise(6,20)’ creates a 6 by 6 matrix with 20% noise added to the empty graph

pyrplib.artificial.example_create(options={'num_games': 1000, 'number_matrices': 10, 'number_of_rows_columns': 20, 'threshold': 3})

Example create function. These functions must return a dominance (D) matrix that is a pandas dataframe. Options is a dictionary. There is one required key/value which is the number_of_rows_columns. It may also have additional arguments.

pyrplib.artificial.example_create2(options={'num_games': 1000, 'number_matrices': 10, 'number_of_rows_columns': 20})

Example create function. These functions must return a dominance (D) matrix that is a pandas dataframe. Options is a dictionary. There is one required key/value which is the number_of_rows_columns. It may also have additional arguments.


Example create function. These functions must return a dominance (D) matrix that is a pandas dataframe. Options is a dictionary. There is one required key/value which is the number_of_rows_columns. It may also have additional arguments.


Example set of options to be paired with example_create function.


Example set of options to be paired with example_create2 function.

pyrplib.artificial.hillsideplusnoise(n, percentnoise, low=1, high=5)


Starts with a perfect hillside graph and then randomly perturbs the matrix at user specified percentage.

Input: n = number of rows/cols in D matrix
percentnoise = integer between 1 and n^2 representing the percentage of noise to add to D hillside, e.g., if percentnoise = 10, then 10% of the n^2 elements will be noise

Example: ‘D = hillsideplusnoise(6,20)’ creates a 6 by 6 matrix with 20% noise added to the hillside graph
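For readers unfamiliar with the term, the sketch below builds one possible "perfect hillside" matrix and checks the hillside property. It assumes hillside form means entries that are non-decreasing across each row and non-increasing down each column, and it assumes the starting matrix can be written as D[i][j] = max(j - i, 0). Both function names are hypothetical and pyrplib may construct its starting matrix differently.

```python
def perfect_hillside(n):
    """Build an n x n matrix with D[i][j] = max(j - i, 0) (assumed form)."""
    return [[max(j - i, 0) for j in range(n)] for i in range(n)]

def is_hillside(D):
    """Check non-decreasing rows and non-increasing columns."""
    n = len(D)
    rows_ok = all(D[i][j] <= D[i][j + 1]
                  for i in range(n) for j in range(n - 1))
    cols_ok = all(D[i][j] >= D[i + 1][j]
                  for i in range(n - 1) for j in range(n))
    return rows_ok and cols_ok

H = perfect_hillside(4)
# H == [[0, 1, 2, 3],
#       [0, 0, 1, 2],
#       [0, 0, 0, 1],
#       [0, 0, 0, 0]]
hillside_before_noise = is_hillside(H)  # True for the unperturbed matrix
```

Noise added to such a matrix generally breaks the property, which is what makes the perturbed matrices useful as test input.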


Function returns a modified version of D with percent of nonzero links removed



Function returns an unweighted version of D

pyrplib.artificial.weakdomplusnoise(n, percentnoise, low=0, high=1)

Creates a weak dominance graph and adds noise.

Input: n = number of rows/cols in D matrix
percentnoise = integer between 1 and n^2 representing the percentage of noise to add to D domgraph, e.g., if percentnoise = 10, then 10% of the n^2 elements will be noise

Example: ‘D = weakdomplusnoise(6,20)’ creates a 6 by 6 matrix with 20% noise added to the dominance graph

pyrplib.base module

class pyrplib.base.DInfo

Bases: object

A class to represent information about a dominance (D) matrix.

property D
property D_type
property command
property dataset_id
static from_json(file)

Static method that reads a DInfo object from a JSON file.


Returns a DInfo object

Return type


property source_dataset_id

Returns a JSON string representing the object.


Returns a JSON string representing the object.

Return type


class pyrplib.base.HillsideCard

Bases: LOPCard

class pyrplib.base.LOPCard

Bases: object

A class that represents the analysis, results, and metrics associated with running the LOP algorithm.

LOPCard can be saved as a JSON file that contains the following:

    "D": "<Dominance matrix and input to the LOP solver>",
    "obj": "<Optimal value of LOP>",
    "solutions": "<List of optimal orderings/permutations that result in an optimal value>",
    "max_tau_solutions": "<Two farthest orderings/permutations measured by Kendall tau (when available)>",
    "centroid_x": "<X*>",
    "outlier_solution": "<Optimal ordering/permutation that is farthest from centroid_x>",
    "dataset_id": "<Identifying ID>"
property D

Adds a solution specified by a permutation/ordering.


sol – A permutation/ordering of type list or tuple

property centroid_solution
property centroid_x
property dataset_id
static from_json(file)

Static method that reads a LOPCard object from a JSON file.


Returns a LOPCard object

Return type


property obj
property outlier_solution
property solutions
property source_dataset_id

Returns a JSON string representing the object.


Returns a JSON string representing the object.

Return type


class pyrplib.base.MatricesInfo

Bases: object

A class to represent information about the matrix M and the vector b of a linear system, i.e., Mx = b.

property b
property command
property dataset_id
static from_json(file)

Static method that reads a MatricesInfo object from a JSON file.


Returns a MatricesInfo object

Return type


property matrix
property source_dataset_id

Returns a JSON string representing the object.


Returns a JSON string representing the object.

Return type


pyrplib.card module

class pyrplib.card.Card

Bases: ABC

The base Card abstract class.

property dataset_id
static get_contents(file)

Static method that reads a Card from a JSON file.


file (str) – File path or URL path to the JSON file


Returns a Pandas Series object

Return type


load(dataset_id, options)

Load a Card using the dataset_id and the options.

  • dataset_id – Dataset ID

  • options (dict) – Dictionary of options

property options
abstract prepare(processed_dataset)
abstract run()
property source_dataset_id

Returns a JSON string representing the object.


Returns a JSON string representing the object.

Return type


abstract view()
class pyrplib.card.Hillside

Bases: LOP

A class that represents the analysis, results, and metrics associated with running the Hillside algorithm.

Hillside finds the optimal solution in Hillside form:

Chartier, Timothy P., et al. “Minimum violations sports ranking using evolutionary optimization and binary integer linear program approaches.” Proceedings of the Tenth Australian Conference on Mathematics and Computers in Sport, A. Bedford and M. Ovens, eds., MathSport (ANZIAM), New South Wales, Australia. 2010.

Hillside and LOP share the same metrics and analysis.

class pyrplib.card.LOP

Bases: Card

A class that represents the analysis, results, and metrics associated with running the LOP algorithm.

LOP card can be saved as a JSON file that contains the following:

    "dataset_id": "<Identifying Dataset ID>",
    "source_dataset_id": "<Identifying Source Dataset ID>",
    "D": "<Dominance matrix and input to the LOP solver>",
    "obj": "<Optimal value of LOP>",
    "solutions": "<List of optimal orderings/permutations that result in an optimal value>",
    "farthest_pair": "<Two farthest orderings/permutations measured by Kendall tau (when available)>",
    "tau_farthest_pair": "<Associated Kendall tau value (when available)>",
    "closest_pair": "<Two (not identical) closest orderings/permutations measured by Kendall tau (when available)>",
    "tau_closest_pair": "<Associated Kendall tau value (when available)>",
    "centroid_x": "<X*>",
    "outlier_solution": "<Optimal ordering/permutation that is farthest from centroid_x>",
    "method": "<Method which is LOP or Hillside>"
property D

Adds a solution specified by a permutation/ordering.


sol – A permutation/ordering of type list or tuple

property beta
property centroid_solution
property centroid_x
property closest_pair
property farthest_pair
static from_json(file_link)

Static method that reads a LOP card object from a JSON file.


Returns a LOP card object

Return type



Returns a dictionary with both dash-ready and notebook-ready visualizations.


Dash and notebook visuals

Return type


property method
property obj
property outlier_solution

Prepare the data for analysis. For LOP this means filling in missing values in the dominance matrix and removing rows and columns with all 0’s.


processed_dataset (dataset.Processed) – Processed dataset object



Return type


property r

Returns a rating vector using X*.


Rating vector derived from X*

Return type



Run the LOP analysis and compute the metrics.
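To make the stored metrics concrete, the sketch below evaluates the objective of an ordering and compares two orderings. It assumes the standard linear ordering problem objective (the sum of D[a][b] over all pairs where a is ranked ahead of b) and measures Kendall tau as a count of discordant pairs; the library may use a solver and may report the correlation coefficient instead. The function names and the brute-force search are illustrative and are not pyrplib's API.

```python
from itertools import combinations, permutations

def lop_objective(D, perm):
    """Sum of D[a][b] over all pairs where a is placed before b in perm."""
    return sum(D[a][b] for a, b in combinations(perm, 2))

def kendall_tau_distance(perm_a, perm_b):
    """Number of item pairs that the two orderings place in opposite order."""
    pos_b = {item: k for k, item in enumerate(perm_b)}
    return sum(1 for x, y in combinations(perm_a, 2) if pos_b[x] > pos_b[y])

D = [[0, 2, 1],
     [1, 0, 3],
     [0, 1, 0]]

# Brute force is only feasible for tiny matrices; it stands in for a real solver.
scores = {perm: lop_objective(D, perm) for perm in permutations(range(3))}
obj = max(scores.values())
solutions = [perm for perm, score in scores.items() if score == obj]
```

Here `obj` is 6 and the only optimal ordering is `(0, 1, 2)`. When several orderings tie, the pair with the largest distance between them plays the role of the farthest pair.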

property solutions
property tau_closest_pair
property tau_farthest_pair

Returns a dictionary in dash ready format.


List of HTML dash ready objects

Return type


property xstar

Return X* as a dataframe using the row and column names of D.

property xstar_r_r

Return X* optimally reordered.

class pyrplib.card.SystemOfEquations(method)

Bases: Card

A class that represents the analysis, results, and metrics associated with solving a system of equations to produce a ranking.

SystemOfEquations card can be saved as a JSON file that contains the following:

    "dataset_id": "<Identifying Dataset ID>",
    "source_dataset_id": "<Identifying Source Dataset ID>",
    "M": "<Matrix from Mx=b>",
    "b": "<Vector from Mx=b>",
    "r": "<Rating vector>",
    "ranking": "<Ranking vector>",
    "perm": "<Ordering/permutation>",
    "options": "<dictionary of options>",
    "games": "<Games (or more generally matchups) that are processed to produce M and b>",
    "teams": "<List of teams (or more generally items)>",
    "method": "<Method which is Massey or Colley>"
property M
property b
static from_json(file_link)

Static method that reads a SystemOfEquations card object from a JSON file.


Returns a SystemOfEquations card object

Return type


property games
property method
property perm

Prepare the data for analysis.


processed_dataset (dataset.Processed) – Processed dataset object



Return type


property r
property ranking

Solve the system of equations and store the results.
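As an illustration of what solving Mx = b means for the Colley method, the sketch below uses the textbook Colley construction: M starts as 2I, each game adds 1 to both teams' diagonal entries and subtracts 1 from their shared off-diagonal entries, and b_i = 1 + (wins_i - losses_i) / 2. The `(winner, loser)` game format, the function names, and the exact-fraction solver are assumptions for the example and do not reflect how pyrplib stores games or solves the system.

```python
from fractions import Fraction

def colley_system(games, teams):
    """Build the Colley matrix M and vector b from (winner, loser) pairs."""
    idx = {team: k for k, team in enumerate(teams)}
    n = len(teams)
    M = [[Fraction(2) if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]
    b = [Fraction(1) for _ in range(n)]
    for winner, loser in games:
        w, l = idx[winner], idx[loser]
        M[w][w] += 1
        M[l][l] += 1
        M[w][l] -= 1
        M[l][w] -= 1
        b[w] += Fraction(1, 2)
        b[l] -= Fraction(1, 2)
    return M, b

def solve(M, b):
    """Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    A = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[r][n] for r in range(n)]

teams = ["A", "B", "C"]
games = [("A", "B"), ("A", "C"), ("B", "C")]
M, b = colley_system(games, teams)
r = solve(M, b)                                         # rating vector
perm = sorted(teams, key=lambda t: -r[teams.index(t)])  # best team first
ranking = [perm.index(t) + 1 for t in teams]            # rank of each team
```

For this round robin the ratings are 7/10, 1/2 and 3/10, so the ordering is A, B, C.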

property teams

Returns a dictionary in dash ready format.


List of HTML dash ready objects

Return type

list module


Bases: object

A class that facilitates accessing the datasets for RPLIB.

This class reads the following TSV files:
  • {DATA_PREFIX}/unprocessed_datasets.tsv
    • Columns: Dataset ID, Dataset Name, Description, Type, Loader, Download links

    • Dataset ID - persistent unique ID for each dataset

    • Dataset Name - Short human readable name for the dataset

    • Description - Longer human readable description of the dataset

    • Type - Games|D matrix|Features|Structured Artificial

    • Loader - Class that is used to load the dataset (e.g., marchmadness.base.Unprocessed)

    • Download links - String of comma separated file links

  • {DATA_PREFIX}/processed_datasets.tsv
    • Columns: Dataset ID, Source Dataset ID, Index, Command, Type, Collection, Options, Last Processed Datetime, Identifier

    • Dataset ID - persistent unique ID for each processed dataset

    • Source Dataset ID - source dataset ID

    • Index - Index pointing into the source dataset to extract the specific value

    • Command - Python functional code statement describing how to process the data. May assume the following variables: data and index.

    • Type - resulting type of dataset (D|Games)

    • Collection - Name of collection for organization in the data directory

    • Options - JSON string of optional options

    • Last Processed Datetime - Last time this dataset was processed/updated

    • Identifier - Optional identifying string for the dataset

  • {DATA_PREFIX}/lop_cards.tsv
    • Columns: Dataset ID, Processed Dataset ID, Options, Last Processed Datetime

    • Dataset ID - persistent unique ID for each card

    • Processed Dataset ID - processed dataset ID used as input

    • Options - JSON string of optional options

    • Last Processed Datetime - Last time this dataset was processed/updated

  • {DATA_PREFIX}/hillside_cards.tsv
    • Columns: Dataset ID, Processed Dataset ID, Options, Last Processed Datetime

    • Dataset ID - persistent unique ID for each card

    • Processed Dataset ID - processed dataset ID used as input

    • Options - JSON string of optional options

    • Last Processed Datetime - Last time this dataset was processed/updated

  • {DATA_PREFIX}/massey_cards.tsv
    • Columns: Dataset ID, Processed Dataset ID, Options, Last Processed Datetime

    • Dataset ID - persistent unique ID for each card

    • Processed Dataset ID - processed dataset ID used as input

    • Options - JSON string of optional options

    • Last Processed Datetime - Last time this dataset was processed/updated

  • {DATA_PREFIX}/colley_cards.tsv
    • Columns: Dataset ID, Processed Dataset ID, Options, Last Processed Datetime

    • Dataset ID - persistent unique ID for each card

    • Processed Dataset ID - processed dataset ID used as input

    • Options - JSON string of optional options

    • Last Processed Datetime - Last time this dataset was processed/updated
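The card files listed above share a simple tab-separated layout with a JSON string in the Options column. A minimal way to read such a file with the standard library is sketched below. The sample row is made up, and the in-memory string stands in for the contents of `{DATA_PREFIX}/lop_cards.tsv`; the class documented here may read the files differently.

```python
import csv
import io
import json

# Stand-in for the contents of {DATA_PREFIX}/lop_cards.tsv; the row is invented.
sample_tsv = (
    "Dataset ID\tProcessed Dataset ID\tOptions\tLast Processed Datetime\n"
    '1\t7\t{"key": "value"}\t2021-01-01 00:00:00\n'
)

# QUOTE_NONE keeps the double quotes inside the JSON string untouched.
reader = csv.DictReader(io.StringIO(sample_tsv), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
rows = list(reader)

# The Options column holds a JSON string, so it is decoded separately.
options = json.loads(rows[0]["Options"])
```

All values come back as strings, so numeric IDs need an explicit conversion if they are to be compared as integers.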

load_card(dataset_id, card_type)

pyrplib.dataset module

class pyrplib.dataset.Processed

Bases: Unprocessed

Processed dataset labeled with a persistent and unique dataset_id

property command

Returns dash ready data

property data

Returns a dataframe

property dataset_id
abstract static from_json(file)
property short_type
abstract size_str()
property source_dataset_id
property type

Return the high level type of an element in data() as a string

class pyrplib.dataset.ProcessedD

Bases: Processed

Processed dominance (D) dataset object

static from_json(file)

Loads a ProcessedD file from a JSON file.


file – Path to a local or HTTP JSON file


Returns a ProcessedD object.

Return type



Load a processed dominance (D) dataset with options


Size of dataset as a string

class pyrplib.dataset.ProcessedGames

Bases: Processed

Processed games dataset object

static from_json(file)

Loads a ProcessedGames file from a JSON file.


file – Path to a local or HTTP JSON file


Returns a ProcessedGames object.

Return type



Load a processed games dataset with options


Size of dataset as a string

class pyrplib.dataset.Unprocessed(dataset_id, links)

Bases: ABC

Unprocessed dataset labeled with a persistent and unique dataset_id

abstract dash_ready_data()

Returns dash ready data


Returns a dataframe

abstract load(options={})

Code that loads the data from the links

abstract type()

Return the high level type of an element in data() as a string


Standard view function for a dataset


Standard view function for an item from a dataset

class pyrplib.dataset.UnprocessedType(value)

Bases: Enum

An enumeration.

D = 0
Features = 2
Games = 1
pyrplib.dataset.load_unprocessed(unprocessed_source_id, datasets_df)

Helper function to load unprocessed dataset.

  • unprocessed_source_id – Unprocessed dataset ID

  • datasets_df – Dataframe of datasets read from data.Data(DATA_PREFIX)


Unprocessed dataset

Return type

dataset.Unprocessed module, id)

Returns a dash data table with standard configuration., download_id, progress_id=None, collapse_id=None)

Return a standard download button., id)

Helper function to view a single item.

pyrplib.transformers module

class pyrplib.transformers.ColumnCountTransformer(columns)

Bases: BaseEstimator, TransformerMixin

A class to convert a feature matrix to a dominance matrix in the standard sklearn transformer paradigm.

fit(X, y=None)
transform(X, y=None)
class pyrplib.transformers.ComputeDTransformer(direct_thres=0, spread_thres=0, team_range=None)

Bases: BaseEstimator, TransformerMixin

A class to convert games to a dominance matrix in the standard sklearn transformer paradigm.

fit(X, y=None)
transform(X, y=None)
pyrplib.transformers.count(games, teams)

Returns a processed direct matchup dominance matrix, processed indirect matchup dominance matrix, and the transformer used.

  • games (pandas.DataFrame) – DataFrame of games (matchups between items)

  • teams (list) – List of teams/items


Tuple of processed D from direct matchups, processed D from indirect matchups, and the transformer

Return type

tuple, ID, trans)

Returns the direct matchup (D) matrix from the arguments.

pyrplib.transformers.directplusindirect(D, ID, trans, indirect_weight=1.0)

Returns a processed D object that is a combination of D and ID using the indirect weight.


Processed dominance matrix that is a weighted combination of D and ID

Return type


pyrplib.transformers.features_to_D(df_features, options={})

Convert a features matrix to a dominance matrix.

options["columns"] = list of columns you would like to convert
options["items"] = list of items you would like to use. Items must be in the index
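One plausible reading of this conversion is sketched below: for every selected column, item i earns one point over item j whenever its feature value is larger. This counting rule is an assumption suggested by the name ColumnCountTransformer, not something the documentation confirms. The function name, the nested-dict data layout, and the feature names are hypothetical; the real function takes a DataFrame.

```python
def features_to_d_sketch(features, options):
    """Count, per pair of items, the columns in which the first item is larger.

    Hypothetical stand-in: `features` maps item -> {column: value} instead
    of being a DataFrame indexed by item.
    """
    items = options["items"]
    columns = options["columns"]
    D = {i: {j: 0 for j in items} for i in items}
    for col in columns:
        for i in items:
            for j in items:
                if i != j and features[i][col] > features[j][col]:
                    D[i][j] += 1
    return D

features = {
    "a": {"speed": 3, "power": 1},
    "b": {"speed": 2, "power": 5},
    "c": {"speed": 1, "power": 0},
}
options = {"columns": ["speed", "power"], "items": ["a", "b", "c"]}
D = features_to_d_sketch(features, options)
```

In this example "a" and "b" each beat the other in one column, while both beat "c" in two.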

pyrplib.transformers.indirect(D, ID, trans)

Returns the indirect matchup (ID) matrix from the arguments.


Returns a processed D object from a dominance matrix (pandas.DataFrame).

pyrplib.transformers.standardize_games_teams(games, teams, options={})

Returns a standardized version of games and teams with the expected column names as a ProcessedGames object.

options["team1_name"] = column in your dataframe that has team 1 names
options["team2_name"] = column in your dataframe that has team 2 names
options["team1_score"] = column in your dataframe that has team 1 score
options["team2_score"] = column in your dataframe that has team 2 score
options["team1_H_A_N"] = column in your dataframe that specifies home = 1, away = -1, or neutral = 0 for team 1
options["team2_H_A_N"] = column in your dataframe that specifies home = 1, away = -1, or neutral = 0 for team 2
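An options dictionary for this function might look like the sketch below. The raw column names ("Home", "Away", and so on) and the sample game are invented for illustration, and the renaming step only mimics the idea of mapping source columns onto the standard names; it assumes the standard names match the option keys and it is not the library's implementation.

```python
# Hypothetical raw record; in practice this would be a row of a DataFrame.
raw_game = {
    "Home": "Team A", "Away": "Team B",
    "HomePts": 70, "AwayPts": 65,
    "HomeSite": 1, "AwaySite": -1,
}

# Each option names the source column that holds the corresponding field.
options = {
    "team1_name": "Home",
    "team2_name": "Away",
    "team1_score": "HomePts",
    "team2_score": "AwayPts",
    "team1_H_A_N": "HomeSite",   # home = 1, away = -1, neutral = 0
    "team2_H_A_N": "AwaySite",
}

# Map the source columns onto the standard names used in the options.
standardized = {standard: raw_game[column]
                for standard, column in options.items()}
```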

Module contents