pyrplib package¶

Submodules¶

pyrplib.artificial module¶

pyrplib.artificial.addmossimple(D, start_index, end_index)¶: For a binary matrix D, create simple multiple optimal solutions in the range of teams specified. Indices are inclusive.

pyrplib.artificial.addnoise(D, percentnoise, low=0, high=1)¶

ADD NOISE

Replaces randomly chosen off-diagonal elements of D with values between low and high.
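The noise step can be pictured with a small stand-alone sketch. This is not pyrplib's implementation: the helper name `add_noise_sketch`, the integer draws, the `seed` argument, and the reading of `percentnoise` as a percentage of all n^2 entries are assumptions made for illustration.

```python
import random


def add_noise_sketch(D, percentnoise, low=0, high=1, seed=None):
    """Return a copy of D with some off-diagonal entries replaced by
    random integers drawn from [low, high] (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(D)
    noisy = [row[:] for row in D]  # copy so the input stays untouched
    off_diagonal = [(i, j) for i in range(n) for j in range(n) if i != j]
    # Assumption: percentnoise is a percentage of all n * n entries.
    k = min(len(off_diagonal), round(percentnoise / 100 * n * n))
    for i, j in rng.sample(off_diagonal, k):
        noisy[i][j] = rng.randint(low, high)
    return noisy


D = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
noisy = add_noise_sketch(D, percentnoise=30, seed=0)
```

A replaced entry may receive the value it already had, so the number of visibly changed entries can be smaller than the number of entries selected.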

pyrplib.artificial.create_dataset(create_func, options)¶

Create a dataset using a create function and a function to generate the options used.

See example_create and example_get_create_options for an example create function and its matching options function.

pyrplib.artificial.create_dataset_manual(D_matrices, options, create_code='manual')¶: Create a dataset by manually passing the D matrices as a list. The options are not used in any way. They are here if you want to include them.

pyrplib.artificial.cyclic(n)¶: Create a simple cycle D matrix of size n x n.
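One natural reading of a "simple cycle" D matrix is that item i dominates item (i + 1) mod n and nothing else. The sketch below builds that matrix with plain lists; the name `cyclic_sketch` is hypothetical and the library's own construction may differ.

```python
def cyclic_sketch(n):
    """Build an n x n 0/1 matrix in which item i beats item (i + 1) mod n."""
    D = [[0] * n for _ in range(n)]
    for i in range(n):
        D[i][(i + 1) % n] = 1
    return D


D = cyclic_sketch(4)
```

Every row and every column of such a matrix contains exactly one 1, so no ordering of the items avoids a violation.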

pyrplib.artificial.domfromranking(n, r, ngames, upset_func=<function <lambda>>)¶

DOM matrix from ranking

Simulates the win/loss outcome of individual games using the ranking vector (r) and the upset function. The upset function must take two rankings r1 and r2, with r1 > r2, and return True or False depending on whether an upset occurred.
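A callable that satisfies the contract described above might look like the following sketch. The factory name `make_upset_func` and the fixed-probability rule are assumptions for illustration; the default lambda used by the library is not shown in this documentation.

```python
import random


def make_upset_func(p, seed=None):
    """Return an upset function that reports an upset with probability p,
    regardless of how far apart the two rankings are (hypothetical helper)."""
    rng = random.Random(seed)

    def upset(r1, r2):
        # Contract from the docs: r1 > r2; return True if an upset occurred.
        return rng.random() < p

    return upset


never = make_upset_func(0.0)
always = make_upset_func(1.0)
```

A function built this way could presumably be passed as `upset_func`, for example `domfromranking(n, r, ngames, upset_func=make_upset_func(0.1))`, although that call is untested here.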

pyrplib.artificial.domplusnoise(n, percentnoise, low=0, high=1)¶

Creates a dominance graph and adds noise.

Input: n = number of rows/cols in D matrix

percentnoise = integer between 1 and n^2 representing the percentage of noise to add to the D dominance graph, e.g., if percentnoise = 10, then 10% of the n^2 elements will be noise

Example: ‘D = domplusnoise(6,20)’ creates a 6 by 6 matrix with 20% noise added to the dominance graph

pyrplib.artificial.emptyplusnoise(n, percentnoise, low=0, high=3)¶

EMPTY + NOISE

Function starts with an empty graph and adds some amount of noise.

Input: n = number of rows/cols in D matrix

percentnoise = integer between 1 and n^2 representing the percentage of noise to add to the empty D matrix, e.g., if percentnoise = 10, then 10% of the n^2 elements will be noise

Example: ‘D = emptyplusnoise(6,20)’ creates a 6 by 6 matrix with 20% noise added to the empty graph

pyrplib.artificial.example_create(options={'num_games': 1000, 'number_matrices': 10, 'number_of_rows_columns': 20, 'threshold': 3})¶: Example create function. These functions must return a dominance (D) matrix that is a pandas dataframe. Options is a dictionary. There is one required key/value which is the number_of_rows_columns. It may also have additional arguments.

pyrplib.artificial.example_create2(options={'num_games': 1000, 'number_matrices': 10, 'number_of_rows_columns': 20})¶: Example create function. These functions must return a dominance (D) matrix that is a pandas dataframe. Options is a dictionary. There is one required key/value which is the number_of_rows_columns. It may also have additional arguments.

pyrplib.artificial.example_create3(options)¶: Example create function. These functions must return a dominance (D) matrix that is a pandas dataframe. Options is a dictionary. There is one required key/value which is the number_of_rows_columns. It may also have additional arguments.

pyrplib.artificial.example_get_create_options()¶: Example set of options to be paired with example_create function.

pyrplib.artificial.example_get_create_options2()¶: Example set of options to be paired with example_create2 function.

pyrplib.artificial.example_get_create_options3()¶

pyrplib.artificial.hillsideplusnoise(n, percentnoise, low=1, high=5)¶

HILLSIDE + NOISE

Starts with a perfect hillside graph and then randomly perturbs the matrix by a user-specified percentage.

Input: n = number of rows/cols in D matrix

percentnoise = integer between 1 and n^2 representing the percentage of noise to add to the D hillside matrix, e.g., if percentnoise = 10, then 10% of the n^2 elements will be noise

Example: ‘D = hillsideplusnoise(6,20)’ creates a 6 by 6 matrix with 20% noise added to the hillside graph

pyrplib.artificial.removelinks(D, percent)¶

REMOVE LINKS

Returns a modified version of D with the given percent of its nonzero links removed.

pyrplib.artificial.unweighted(D)¶

CONVERT TO UNWEIGHTED

Function returns an unweighted version of D
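As a minimal sketch of what "unweighted" usually means for a dominance matrix, the snippet below maps every positive entry to 1 and everything else to 0. The helper name `unweighted_sketch` and the positive-entry rule are assumptions; the library function may treat edge cases differently.

```python
def unweighted_sketch(D):
    """Map each positive entry of D to 1 and every other entry to 0."""
    return [[1 if value > 0 else 0 for value in row] for row in D]


weighted = [[0, 3, 0], [1, 0, 7], [0, 0, 0]]
binary = unweighted_sketch(weighted)
```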

pyrplib.artificial.weakdomplusnoise(n, percentnoise, low=0, high=1)¶

Creates a weak dominance graph and adds noise.

Input: n = number of rows/cols in D matrix

percentnoise = integer between 1 and n^2 representing the percentage of noise to add to the D dominance graph, e.g., if percentnoise = 10, then 10% of the n^2 elements will be noise

Example: ‘D = weakdomplusnoise(6,20)’ creates a 6 by 6 matrix with 20% noise added to the dominance graph

pyrplib.base module¶

class pyrplib.base.DInfo¶

Bases: object

A class to represent information about a dominance (D) matrix.

property D¶

property D_type¶

property command¶

property dataset_id¶

static from_json(file)¶

Static method that reads a DInfo object from a JSON file.

Returns: Returns a DInfo object
Return type: DInfo

property source_dataset_id¶

to_json()¶

Returns a JSON string representing the object.

Returns: Returns a JSON string representing the object.
Return type: str

class pyrplib.base.HillsideCard¶: Bases: LOPCard

class pyrplib.base.LOPCard¶

Bases: object

A class that represents the analysis, results, and metrics associated with running the LOP algorithm.

LOPCard can be saved as a JSON file that contains the following:

{
    "D": "<Dominance matrix and input to the LOP solver>",
    "obj": "<Optimal value of LOP>",
    "solutions": "<List of optimal orderings/permutations that result in an optimal value>",
    "max_tau_solutions": "<Two farthest orderings/permutations measured by Kendall tau (when available)>",
    "centroid_x": "<X*>",
    "outlier_solution": "<Optimal ordering/permutation that is farthest from centroid_x>",
    "dataset_id": "<Identifying ID>"
}

property D¶

add_solution(sol)¶

Adds a solution specified by a permutation/ordering.

Parameters: sol – A permutation/ordering of type list or tuple

property centroid_solution¶

property centroid_x¶

property dataset_id¶

static from_json(file)¶

Static method that reads a LOPCard object from a JSON file.

Returns: Returns a LOPCard object
Return type: LOPCard

property obj¶

property outlier_solution¶

property solutions¶

property source_dataset_id¶

to_json(file)¶

Returns a JSON string representing the object.

Returns: Returns a JSON string representing the object.
Return type: str

class pyrplib.base.MatricesInfo¶

Bases: object

A class to represent information about the matrices M and b, i.e., the system Mx=b.

property b¶

property command¶

property dataset_id¶

static from_json(file)¶

Static method that reads a MatricesInfo object from a JSON file.

Returns: Returns a MatricesInfo object
Return type: MatricesInfo

property matrix¶

property source_dataset_id¶

to_json()¶

Returns a JSON string representing the object.

Returns: Returns a JSON string representing the object.
Return type: str

pyrplib.card module¶

class pyrplib.card.Card¶

Bases: ABC

The base Card abstract class.

property dataset_id¶

static get_contents(file)¶

Static method that reads a Card from a JSON file.

Parameters: file (str) – File path or URL path to a JSON file
Returns: Returns a Pandas Series object
Return type: pandas.Series

load(dataset_id, options)¶

Load a Card using the dataset_id and the options.

Parameters

dataset_id – Dataset ID
options (dict) – Dictionary of options

property options¶

abstract prepare(processed_dataset)¶

abstract run()¶

property source_dataset_id¶

to_json()¶

Returns a JSON string representing the object.

Returns: Returns a JSON string representing the object.
Return type: str

abstract view()¶

class pyrplib.card.Hillside¶

Bases: LOP

A class that represents the analysis, results, and metrics associated with running the Hillside algorithm.

Hillside finds the optimal solution in Hillside form, as described in:

Chartier, Timothy P., et al. “Minimum violations sports ranking using evolutionary optimization and binary integer linear program approaches.” Proceedings of the Tenth Australian Conference on Mathematics and Computers in Sport, A. Bedford and M. Ovens, eds., MathSport (ANZIAM), New South Wales, Australia. 2010.

Hillside and LOP share the same metrics and analysis.

class pyrplib.card.LOP¶

Bases: Card

A class that represents the analysis, results, and metrics associated with running the LOP algorithm.

LOP card can be saved as a JSON file that contains the following:

{
    "dataset_id": "<Identifying Dataset ID>",
    "source_dataset_id": "<Identifying Source Dataset ID>",
    "D": "<Dominance matrix and input to the LOP solver>",
    "obj": "<Optimal value of LOP>",
    "solutions": "<List of optimal orderings/permutations that result in an optimal value>",
    "farthest_pair": "<Two farthest orderings/permutations measured by Kendall tau (when available)>",
    "tau_farthest_pair": "<Associated Kendall tau value (when available)>",
    "closest_pair": "<Two (not identical) closest orderings/permutations measured by Kendall tau (when available)>",
    "tau_closest_pair": "<Associated Kendall tau value (when available)>",
    "centroid_x": "<X*>",
    "outlier_solution": "<Optimal ordering/permutation that is farthest from centroid_x>",
    "method": "<Method which is LOP or Hillside>"
}
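The farthest_pair and closest_pair fields compare optimal orderings by Kendall tau. The documentation does not say whether the library stores the Kendall tau distance or the correlation coefficient; the sketch below assumes the distance (the number of item pairs ranked in opposite order), and both function names are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(order_a, order_b):
    """Count the item pairs that the two orderings rank in opposite order."""
    pos_a = {item: k for k, item in enumerate(order_a)}
    pos_b = {item: k for k, item in enumerate(order_b)}
    return sum(
        1
        for x, y in combinations(order_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def farthest_and_closest(solutions):
    """Return (distance, a, b) for the farthest and the closest pair."""
    scored = [
        (kendall_tau_distance(a, b), a, b)
        for a, b in combinations(solutions, 2)
    ]
    farthest = max(scored, key=lambda t: t[0])
    closest = min(scored, key=lambda t: t[0])
    return farthest, closest


solutions = [(0, 1, 2, 3), (1, 0, 2, 3), (3, 2, 1, 0)]
farthest, closest = farthest_and_closest(solutions)
```

Comparing every pair is quadratic in the number of solutions, which may explain why the fields are documented as present only "when available".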

property D¶

add_solution(sol)¶

Adds a solution specified by a permutation/ordering.

Parameters: sol – A permutation/ordering of type list or tuple

property beta¶

property centroid_solution¶

property centroid_x¶

property closest_pair¶

property farthest_pair¶

static from_json(file_link)¶

Static method that reads a LOP card object from a JSON file.

Returns: Returns a LOP card object
Return type: LOP

get_visuals()¶

Returns a dictionary with both Dash-ready and notebook-ready visualizations.

Returns: Dash and notebook visuals
Return type: dict

property method¶

property obj¶

property outlier_solution¶

prepare(processed_dataset)¶

Prepare the data for analysis. For LOP this means filling in missing values in the dominance matrix and removing rows and columns with all 0’s.

Parameters: processed_dataset (dataset.Processed) – Processed dataset object
Returns: self
Return type: LOP
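The preparation step described for prepare can be sketched with plain lists. The details here are assumptions: missing values are represented as None and filled with 0, and an item is dropped only when both its row and its column are entirely zero. The helper name `prepare_sketch` is hypothetical.

```python
def prepare_sketch(D, labels):
    """Fill missing entries with 0, then drop items whose row and column
    are both all zeros. Returns the reduced matrix and its labels."""
    n = len(D)
    # Assumption: missing values are None and are filled with 0.
    filled = [[0 if value is None else value for value in row] for row in D]
    keep = [
        i
        for i in range(n)
        if any(filled[i][j] for j in range(n))
        or any(filled[j][i] for j in range(n))
    ]
    reduced = [[filled[i][j] for j in keep] for i in keep]
    return reduced, [labels[i] for i in keep]


D = [[0, 2, None], [1, 0, 0], [None, 0, 0]]
reduced, kept = prepare_sketch(D, ["A", "B", "C"])
```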

property r¶

Returns a rating vector using X*.

Returns: Rating vector derived from X*
Return type: pandas.Series

run()¶: Run the LOP analysis and compute the metrics.

property solutions¶

property tau_closest_pair¶

property tau_farthest_pair¶

view()¶

Returns the card in Dash-ready format.

Returns: List of HTML dash ready objects
Return type: list

property xstar¶: Return X* as a dataframe using the row and column names of D.

property xstar_r_r¶: Return X* optimally reordered.

class pyrplib.card.SystemOfEquations(method)¶

Bases: Card

A class that represents the analysis, results, and metrics associated with solving a system of equations to produce a ranking.

SystemOfEquations card can be saved as a JSON file that contains the following:

{
    "dataset_id": "<Identifying Dataset ID>",
    "source_dataset_id": "<Identifying Source Dataset ID>",
    "M": "<Matrix from Mx=b>",
    "b": "<Vector from Mx=b>",
    "r": "<Rating vector>",
    "ranking": "<Ranking vector>",
    "perm": "<Ordering/permutation>",
    "options": "<dictionary of options>",
    "games": "<Games (or more generally matchups) that are processed to produce M and b>",
    "teams": "<List of teams (or more generally items)>",
    "method": "<Method which is Massey or Colley>"
}
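To make the M, b, r, and perm fields concrete, here is a sketch of the standard Colley formulation on a three-team example, solved with a small Gaussian elimination. How pyrplib itself builds M and b from games (ties, home/away handling, options) is not shown in this documentation, so the helper names and the (winner, loser) game format are assumptions.

```python
def solve_linear_system(M, b):
    """Solve Mx = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for c in range(col, n + 1):
                A[row][c] -= factor * A[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(A[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (A[row][n] - known) / A[row][row]
    return x


def colley_system(games, teams):
    """Build the Colley matrix and right-hand side from (winner, loser) pairs."""
    index = {team: k for k, team in enumerate(teams)}
    n = len(teams)
    M = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for winner, loser in games:
        wi, li = index[winner], index[loser]
        M[wi][wi] += 1  # diagonal: 2 + games played
        M[li][li] += 1
        M[wi][li] -= 1  # off-diagonal: minus games between the two teams
        M[li][wi] -= 1
        b[wi] += 0.5    # b = 1 + (wins - losses) / 2
        b[li] -= 0.5
    return M, b


teams = ["A", "B", "C"]
games = [("A", "B"), ("B", "C"), ("A", "C")]
M, b = colley_system(games, teams)
r = solve_linear_system(M, b)
perm = sorted(range(len(teams)), key=lambda k: -r[k])
```

In this example the ratings come out ordered A, B, C, which matches the win/loss record.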

property M¶

property b¶

static from_json(file_link)¶

Static method that reads a SystemOfEquations card object from a JSON file.

Returns: Returns a SystemOfEquations card object
Return type: SystemOfEquations

property games¶

property method¶

property perm¶

prepare(processed_dataset)¶

Prepare the data for analysis.

Parameters: processed_dataset (dataset.Processed) – Processed dataset object
Returns: self
Return type: SystemOfEquations

property r¶

property ranking¶

run()¶: Solve the system of equations and store the results.

property teams¶

view()¶

Returns the card in Dash-ready format.

Returns: List of HTML dash ready objects
Return type: list

pyrplib.data module¶

class pyrplib.data.Data(DATA_PREFIX)¶

Bases: object

A class that facilitates accessing the datasets for RPLIB.

This class reads the following TSV files:

{DATA_PREFIX}/unprocessed_datasets.tsv
- Columns: Dataset ID, Dataset Name, Description, Type, Loader, Download links
- Dataset ID - persistent unique ID for each dataset
- Dataset Name - short human-readable name for the dataset
- Description - longer human-readable description of the dataset
- Type - Games|D matrix|Features|Structured Artificial
- Loader - class that is used to load the dataset (e.g., marchmadness.base.Unprocessed)
- Download links - string of comma-separated file links

{DATA_PREFIX}/processed_datasets.tsv
- Columns: Dataset ID, Source Dataset ID, Index, Command, Type, Collection, Options, Last Processed Datetime, Identifier
- Dataset ID - persistent unique ID for each processed dataset
- Source Dataset ID - source dataset ID
- Index - index pointing into the source dataset to extract the specific value
- Command - Python functional code statement describing how to process the data. May assume the following variables: data and index.
- Type - resulting type of dataset (D|Games)
- Collection - name of collection for organization in the data directory
- Options - JSON string of optional options
- Last Processed Datetime - last time this dataset was processed/updated
- Identifier - optional identifying string for the dataset

{DATA_PREFIX}/lop_cards.tsv
- Columns: Dataset ID, Processed Dataset ID, Options, Last Processed Datetime
- Dataset ID - persistent unique ID for each card
- Processed Dataset ID - processed dataset ID used as input
- Options - JSON string of optional options
- Last Processed Datetime - last time this dataset was processed/updated

{DATA_PREFIX}/hillside_cards.tsv
- Columns: Dataset ID, Processed Dataset ID, Options, Last Processed Datetime
- Dataset ID - persistent unique ID for each card
- Processed Dataset ID - processed dataset ID used as input
- Options - JSON string of optional options
- Last Processed Datetime - last time this dataset was processed/updated

{DATA_PREFIX}/massey_cards.tsv
- Columns: Dataset ID, Processed Dataset ID, Options, Last Processed Datetime
- Dataset ID - persistent unique ID for each card
- Processed Dataset ID - processed dataset ID used as input
- Options - JSON string of optional options
- Last Processed Datetime - last time this dataset was processed/updated

{DATA_PREFIX}/colley_cards.tsv
- Columns: Dataset ID, Processed Dataset ID, Options, Last Processed Datetime
- Dataset ID - persistent unique ID for each card
- Processed Dataset ID - processed dataset ID used as input
- Options - JSON string of optional options
- Last Processed Datetime - last time this dataset was processed/updated
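The TSV layout listed above can be read with the standard library alone. The sketch below parses one made-up row in the processed_datasets.tsv column layout and decodes its Options column from JSON; every value in the sample row is a placeholder, not real RPLIB data, and this is not how the Data class itself reads the files.

```python
import csv
import io
import json

# Made-up sample in the processed_datasets.tsv column layout.
sample_tsv = (
    "Dataset ID\tSource Dataset ID\tIndex\tCommand\tType\tCollection\t"
    "Options\tLast Processed Datetime\tIdentifier\n"
    '1\t1\t0\tplaceholder\tD\tdemo\t{"threshold": 3}\t2021-01-01\tsample\n'
)

# QUOTE_NONE keeps the double quotes inside the JSON column untouched.
reader = csv.DictReader(
    io.StringIO(sample_tsv), delimiter="\t", quoting=csv.QUOTE_NONE
)
rows = []
for row in reader:
    # The Options column holds a JSON string; an empty cell means no options.
    row["Options"] = json.loads(row["Options"]) if row["Options"] else {}
    rows.append(row)
```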

load_card(dataset_id, card_type)¶

load_processed(dataset_id)¶

load_unprocessed(dataset_id)¶

save_colley_datasets()¶

save_hillside_datasets()¶

save_lop_datasets()¶

save_massey_datasets()¶

save_processed_datasets()¶

pyrplib.dataset module¶

class pyrplib.dataset.Processed¶

Bases: Unprocessed

Processed dataset labeled with a persistent and unique dataset_id

property command¶

dash_ready_data()¶: Returns dash ready data

property data¶: Returns a dataframe

property dataset_id¶

abstract static from_json(file)¶

property short_type¶

abstract size_str()¶

property source_dataset_id¶

to_json()¶

property type¶: Return the high level type of an element in data() as a string

class pyrplib.dataset.ProcessedD¶

Bases: Processed

Processed dominance (D) dataset object

static from_json(file)¶

Loads a ProcessedD object from a JSON file.

Parameters: file – Path to a local or HTTP JSON file
Returns: Returns a ProcessedD object.
Return type: ProcessedD

load(options={})¶: Load a processed dominance (D) dataset with options

size_str()¶: Size of dataset as a string

class pyrplib.dataset.ProcessedGames¶

Bases: Processed

Processed games dataset object

static from_json(file)¶

Loads a ProcessedGames object from a JSON file.

Parameters: file – Path to a local or HTTP JSON file
Returns: Returns a ProcessedGames object.
Return type: ProcessedGames

load(options={})¶: Load a processed games dataset with options

size_str()¶: Size of dataset as a string

class pyrplib.dataset.Unprocessed(dataset_id, links)¶

Bases: ABC

Unprocessed dataset labeled with a persistent and unique dataset_id

abstract dash_ready_data()¶: Returns dash ready data

data()¶: Returns a dataframe

abstract load(options={})¶: Code that loads the data from the links

abstract type()¶: Return the high level type of an element in data() as a string

view()¶: Standard view function for a dataset

view_item(index)¶: Standard view function for an item from a dataset

class pyrplib.dataset.UnprocessedType(value)¶

Bases: Enum

An enumeration.

D = 0¶

Features = 2¶

Games = 1¶
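For readers less familiar with Python enumerations, the stand-in below mirrors the three documented members and shows lookup by value and by name. The class name `UnprocessedTypeSketch` is hypothetical; it is a local copy for illustration, not an import from pyrplib.

```python
from enum import Enum


class UnprocessedTypeSketch(Enum):
    """Local stand-in with the same members and values as documented."""
    D = 0
    Games = 1
    Features = 2


by_value = UnprocessedTypeSketch(1)        # lookup by value
by_name = UnprocessedTypeSketch["Games"]   # lookup by name
```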

pyrplib.dataset.load_unprocessed(unprocessed_source_id, datasets_df)¶

Helper function to load unprocessed dataset.

Parameters

unprocessed_source_id – Unprocessed dataset ID
datasets_df – Dataframe of datasets read from data.Data(DATA_PREFIX)

Returns

Unprocessed dataset

Return type

dataset.Unprocessed

pyrplib.style module¶

pyrplib.style.get_standard_data_table(df, id)¶: Returns a dash data table with standard configuration.

pyrplib.style.get_standard_download_all_button(button_id, download_id, progress_id=None, collapse_id=None)¶: Return a standard download button.

pyrplib.style.view_item(item, id)¶: Helper function to view a single item.

pyrplib.transformers module¶

class pyrplib.transformers.ColumnCountTransformer(columns)¶

Bases: BaseEstimator, TransformerMixin

A class to convert a feature matrix to a dominance matrix in the standard sklearn transformer paradigm.

fit(X, y=None)¶

transform(X, y=None)¶

class pyrplib.transformers.ComputeDTransformer(direct_thres=0, spread_thres=0, team_range=None)¶

Bases: BaseEstimator, TransformerMixin

A class to convert games to a dominance matrix in the standard sklearn transformer paradigm.

fit(X, y=None)¶

transform(X, y=None)¶
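The idea of converting games to a dominance matrix can be sketched without sklearn. In the simplified version below, entry (i, j) counts how many times team i beat team j and ties are ignored. The `direct_thres`, `spread_thres`, and `team_range` parameters of ComputeDTransformer are not modeled, and the function name and game tuple format are assumptions.

```python
def games_to_D_sketch(games, teams):
    """Count wins: D[i][j] is the number of games in which team i beat team j."""
    index = {team: k for k, team in enumerate(teams)}
    n = len(teams)
    D = [[0] * n for _ in range(n)]
    for team1, score1, team2, score2 in games:
        i, j = index[team1], index[team2]
        if score1 > score2:
            D[i][j] += 1
        elif score2 > score1:
            D[j][i] += 1
        # Assumption: tied games contribute nothing.
    return D


teams = ["A", "B", "C"]
games = [("A", 70, "B", 65), ("B", 80, "C", 80), ("C", 60, "A", 75), ("A", 55, "B", 58)]
D = games_to_D_sketch(games, teams)
```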

pyrplib.transformers.count(games, teams)¶

Returns a processed direct matchup dominance matrix, processed indirect matchup dominance matrix, and the transformer used.

Parameters

games (pandas.DataFrame) – DataFrame of games (matchups between items)
teams (list) – List of teams/items

Returns

Tuple of processed D from direct matchups, processed D from indirect matchups, and the transformer

Return type

tuple

pyrplib.transformers.direct(D, ID, trans)¶: Returns the direct matchup (D) matrix from the arguments.

pyrplib.transformers.directplusindirect(D, ID, trans, indirect_weight=1.0)¶

Returns a processed D object that is a combination of D and ID using the indirect weight.

Returns: Processed dominance matrix that is a weighted combination of D and ID
Return type: processed_D
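One plausible form of the weighted combination is D + indirect_weight * ID, computed entry by entry. That formula is an assumption, since the documentation only says the result combines D and ID using the indirect weight; the helper name `combine_sketch` is hypothetical and works on plain lists rather than processed D objects.

```python
def combine_sketch(D, ID, indirect_weight=1.0):
    """Entrywise D + indirect_weight * ID for two equally sized matrices."""
    return [
        [direct + indirect_weight * indirect for direct, indirect in zip(d_row, id_row)]
        for d_row, id_row in zip(D, ID)
    ]


D = [[0, 2], [1, 0]]
ID = [[0, 1], [4, 0]]
combined = combine_sketch(D, ID, indirect_weight=0.5)
```

Under this reading, a weight of 0 reduces to the direct matrix alone, and larger weights let indirect matchups count for more.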

pyrplib.transformers.features_to_D(df_features, options={})¶

Convert a features matrix to a dominance matrix.

options[“columns”] = list of columns you would like to convert

options[“items”] = list of items you would like to use. Items must be in the index

pyrplib.transformers.indirect(D, ID, trans)¶: Returns the indirect matchup (ID) matrix from the arguments.

pyrplib.transformers.process_D(D)¶: Returns a processed D object from a dominance matrix (pandas.DataFrame).

pyrplib.transformers.standardize_games_teams(games, teams, options={})¶

Returns a standardized version of games and teams with the expected column names as a ProcessedGames object.

options[“team1_name”] = column in your dataframe that has team 1 names

options[“team2_name”] = column in your dataframe that has team 2 names

options[“team1_score”] = column in your dataframe that has team 1 score

options[“team2_score”] = column in your dataframe that has team 2 score

options[“team1_H_A_N”] = column in your dataframe that specifies home = 1, away = -1, or neutral = 0 for team 1

options[“team2_H_A_N”] = column in your dataframe that specifies home = 1, away = -1, or neutral = 0 for team 2
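An options dictionary for this function maps each documented key to a column name in your own data. In the sketch below the keys come from the list above, while the column names on the right and the sample row are made up; the final check simply confirms that every referenced column exists before the options are handed to the library.

```python
# One made-up game record; the column names are hypothetical.
raw_games = [
    {"home": "A", "away": "B", "home_pts": 70, "away_pts": 65,
     "home_site": 1, "away_site": -1},
]

# Keys are the documented option names; values are the columns above.
options = {
    "team1_name": "home",
    "team2_name": "away",
    "team1_score": "home_pts",
    "team2_score": "away_pts",
    "team1_H_A_N": "home_site",   # home = 1, away = -1, neutral = 0
    "team2_H_A_N": "away_site",
}

# Sanity check: every column named in options must exist in the data.
missing = [column for column in options.values() if column not in raw_games[0]]
```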
