pq2 (1) - Linux Manuals

Command to display pq2 manual in Linux: $ man 1 pq2

NAME

pq2 - The command line interface to a dataset meta-repository based on ROOT files

SYNOPSIS

pq2 action options

DESCRIPTION

This manual page briefly documents the pq2 program.

pq2 is a ROOT application providing an interface to a dataset meta-repository based on ROOT files. The repository can be accessed via the local file system, a remote file server daemon, or a PROOF facility.

When working with a local or remote file system, pq2 instantiates a TDataSetManagerFile class on the specified local or remote directory. Remote access is done via the TFile interface, so any implementation of TFile supported by the installation can in principle be used. When working with a PROOF server, the TProof dataset manager interface is used to access the dataset repository attached to the PROOF cluster.

ACTIONS

ls: list compact information about all datasets or a selection of them.
ls-files: list compact information about all the files of a given dataset.
ls-files-server: list the file content of a dataset on a given server or list of servers.
info-server: display compact information about the datasets on a given server or a set of servers.
ana-dist: analyse the file distribution of a dataset (or a set of datasets) over the file servers, either in terms of number of files or of file sizes. The output is a text file listing the file movements needed to make the distribution even in the chosen metric; it can be used as input to pq2-redistribute. Optionally the internal objects can be saved so that they can serve as the starting point for a subsequent run. A histogram and a plot can also be saved to visualize the file distribution.
put: register one or more datasets.
rm: remove one or more datasets.
verify: scan the content of one or more datasets.
cache: display or clear the local cache content.

COMMON OPTIONS

Some of the options listed below have a slightly different meaning depending on the action. Please refer to the man pages of the script interfaces to the actions for more details (see below).

-d <dataset>: For all actions except put, the dataset to be processed. For listing actions the wildcard '*' is supported. For action put, <dataset> is the path to the file containing the list of files in the dataset, or to a directory holding such file lists for the datasets to be registered; in the first case the wildcard '*' can be used in the file name, e.g. '<dir>/fil*' is accepted but '<dir>/*/file' is not. In all cases the name of the dataset is the name of the file finally used.
-u <serverurl>: URL of the PROOF master or data server providing the information; for data servers, it must include the directory. Can also be specified via the environment variables PQ2PROOFURL or PQ2DSSRVURL (see ENVIRONMENT VARIABLES).
-o <options>: Specify a string of options to be passed to the instance actually performing the action; the exact meaning is action dependent.
-s <servers>: Specify a server or a comma-separated list of servers to be used in the analysis when required by the action.
-k: Keep the temporary files created during the analysis under $TMPDIR
-v: Verbose mode
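
As an illustration of how the common options combine with an action, the following is a minimal Python sketch that assembles (but does not run) a pq2 command line and checks the wildcard rule described for 'put'. The helper names build_argv and put_pattern_ok, the dataset name and the server URL are hypothetical; only the action names and the flags -d, -u, -s, -o, -k and -v come from this page.

```python
import posixpath

# Action names as listed in the ACTIONS section of this page.
ACTIONS = {
    "ls", "ls-files", "ls-files-server", "info-server",
    "ana-dist", "put", "rm", "verify", "cache",
}


def put_pattern_ok(path):
    """Return True if any '*' appears only in the final path component.

    Mirrors the rule for 'put': '<dir>/fil*' is accepted, '<dir>/*/file' is not.
    """
    directory, _name = posixpath.split(path)
    return "*" not in directory


def build_argv(action, dataset=None, server_url=None, servers=None,
               options=None, keep=False, verbose=False):
    """Assemble a 'pq2 action options' argument list (hypothetical helper)."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if action == "put" and dataset is not None and not put_pattern_ok(dataset):
        raise ValueError("for 'put', '*' is only allowed in the file name")
    argv = ["pq2", action]
    if dataset is not None:
        argv += ["-d", dataset]
    if server_url is not None:
        argv += ["-u", server_url]
    if servers:
        # -s takes a single server or a comma-separated list.
        argv += ["-s", ",".join(servers)]
    if options is not None:
        argv += ["-o", options]
    if keep:
        argv.append("-k")
    if verbose:
        argv.append("-v")
    return argv


# Hypothetical dataset name and PROOF master URL, for illustration only.
print(build_argv("ls-files", dataset="mydataset",
                 server_url="proof://master.example.org", verbose=True))
```

The resulting list could be handed to subprocess.run on a machine where pq2 is installed; whether a given action honours every option shown here depends on the action, as noted above.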

OPTIONS for action verify

The options listed below apply only to action 'verify'

-r <redirector>: Force re-location of the files via the specified redirector; useful after a file redistribution on the same file server.

OPTIONS for action ana-dist

The options listed below apply only to action 'ana-dist'

-m <metrics>: Defines the metric to be used in the distribution analysis. The currently supported values are 'F' to use the number of files, and 'S' to use the file size.
-f <result file>: Defines the file where the result of the analysis is saved; by default the result is sent to the screen. The output contains one line for each file that needs to be moved, with the format 'file source destination', where 'file' is the file name, 'source' is the source server URL and 'destination' is the destination server URL.
--outfile <output file>: Defines the file where the output of the analysis is saved in binary form (ROOT file); this output can be used as the starting point for a subsequent run, making it possible to process many datasets in separate steps.
--infile <input file>: Defines the ROOT file from where to fetch the output of a previous run (saved with --outfile).
-i <ignored servers>: Specify a server or a comma-separated list of servers to be ignored in the analysis; this makes it possible to exclude, for example, the redirector.
-e <excluded servers>: Specify a server or a comma-separated list of servers to be excluded from the target servers; this can be used, for example, to determine the file movements needed to drain a server.
--plot [<plot file>]: Defines the file for the output plot, which shows the original distribution with the server names and the +/-10% limits; the extension (if known) defines the format; the default format is 'png' and the default name is 'plot.png'. The plot can also be obtained directly from a binary output file (saved with '--outfile <outfile>.root') by just specifying '--infile <outfile>.root --plot'.
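
The result file written with -f lends itself to simple post-processing. The following is a minimal Python sketch that reads lines in the 'file source destination' format and counts the planned movements per destination server. It assumes the three fields are separated by whitespace and contain no spaces themselves, which this page does not state explicitly; the function names and the sample paths and URLs are hypothetical.

```python
from collections import Counter


def parse_moves(lines):
    """Yield (file, source, destination) tuples from ana-dist result lines."""
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            # Assumption: blank or unexpected lines are skipped, not errors.
            continue
        yield tuple(fields)


def moves_per_destination(lines):
    """Count how many files are to be moved to each destination server."""
    return Counter(dest for _file, _src, dest in parse_moves(lines))


# Hypothetical sample in the documented 'file source destination' format.
sample = [
    "/data/run1.root root://srv1.example.org root://srv2.example.org",
    "/data/run2.root root://srv1.example.org root://srv3.example.org",
    "/data/run3.root root://srv3.example.org root://srv2.example.org",
    "",
]

for server, count in sorted(moves_per_destination(sample).items()):
    print(server, count)
```

With a real result file, the same functions can be applied to an open file object, for example moves_per_destination(open("result.txt")), where "result.txt" stands for whatever name was passed to -f.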

ENVIRONMENT VARIABLES

See setup-pq2(1).

ORIGINAL AUTHORS

Gerardo Ganis for the ROOT team.

COPYRIGHT

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

AUTHOR

This manual page was originally written by Gerardo Ganis <gerardo.ganis [at] cern.ch>, for ROOT version 5.