Monday, March 7, 2011
Oracle SQL Basics
Symphony KT
CitiFinancial is replacing one of its legacy applications (Maestro) with a web-based application called Symphony; the project itself is also called Symphony. It is a 120 million project with around 200 people from different teams working on it. We work on an application called Lead Management, one of the major enhancements being made to the existing loan booking system. It is a CRM (Customer Relationship Management) application that would eventually become a revenue-generating application for CitiFinancial. We work with different types of customers, including present borrowers, former borrowers, private labels and, mainly, potential customers (prospects).
The data for all customers is pulled from a centralized Citi database called Account Master (the existing production system) by a third-party vendor called Datalab and is sent to the ETL team as a space-formatted flat file. Before sending the data to the testing team, Datalab applies scrubbing algorithms to the data so that personal information such as SSNs, phone numbers, names and account details is not exposed for general testing purposes.
Once the file is received, the testing team reviews it and validates the data in it. The business provides a field layout that describes the characteristics of each field and its boundary values; based on that, the file is validated using Microsoft Excel or Microsoft Access. Once the testing team approves the data, the ETL team picks up the file and loads it into the database using the data warehousing tool Ab Initio. The loaded data is then validated against the business requirements and business rules using SQL queries. We use TOAD to access the Oracle database and write complex SQL queries to validate the data; in practice we join more than 10 tables, effectively creating our own view, and validate the data against the source flat file.
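To give an idea of these validation joins, here is a minimal sketch; the table and column names (customer, address, account, load_date, etc.) are illustrative only, not the actual Symphony schema:

-- Illustrative tables/columns only: pull customer, address and account
-- data into one result set that can be compared row by row against the
-- source flat file.
SELECT c.cust_id,
       c.first_name,
       c.last_name,
       a.street_addr,
       a.zip_code,
       ac.account_no,
       ac.loan_amount
  FROM customer c
  JOIN address  a  ON a.cust_id  = c.cust_id
  JOIN account  ac ON ac.cust_id = c.cust_id
 WHERE c.load_date = TO_DATE('2011-03-07', 'YYYY-MM-DD')
 ORDER BY c.cust_id;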
Once the data is loaded and validated, we start testing the web application. The web application we use is a Chordiant product; Chordiant is a company that sells out-of-the-box (OOTB) applications for different lines of business, so our goal was to enhance the OOTB application based on our requirements. The input for this application is the marketing data loaded earlier by ETL. We pull the data, test all the functionality and the GUI, and check whether the transactions have been updated in the database, using SQL to validate the DB part. Once functional, integration and system testing (which covers other systems such as ISW, EERS and HR Feed, all internal Citi systems) is complete, we move on to the data warehousing phase.
For creating the data warehouse, the input is the source tables. We again use Ab Initio to extract the data from the source, based on the requirements, the rules and the front-end transactions, and migrate it to the data warehouse (the reporting tables). The target tables are the reporting tables, which are built on a star schema (dimensions and facts).
The testing team validates the source data against the target data using SQL queries and joins, as well as against the business requirements and rules. Once that validation is done, we log in to a BI tool called Cognos and validate the mapping between the front end and the back end. After the Cognos reports are validated, we validate the Cognos cubes. Cubes are mainly used for enterprise-level reports because a cube takes a snapshot of the data and stores it within itself until it is refreshed again, so the user can work with massive amounts of data, analyse the reports and create ad hoc views of reports as needed without performance issues.
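The source-to-target validation mentioned above can be sketched with MINUS queries; the staging and reporting table names below are assumptions for illustration:

-- Rows in the source that never reached the reporting table, and rows in
-- the reporting table that have no source counterpart; both queries
-- should return zero rows if the load is correct.
SELECT cust_id, loan_amount FROM stg_loan_source
MINUS
SELECT cust_id, loan_amount FROM rpt_loan_fact;

SELECT cust_id, loan_amount FROM rpt_loan_fact
MINUS
SELECT cust_id, loan_amount FROM stg_loan_source;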
For all of these validations we need a proper test strategy, test plan and test cases written before proceeding with execution. Defects are logged in QC, and the testing team is responsible for defect management and tracking defects to closure. If the business decides not to fix a defect in the current release because it is not critical for launch, that defect is backlogged for future releases.
We are involved in the full life cycle of the project, right from business requirement analysis, design document analysis, drafting the test strategy (please look at the test strategy of the LM project to learn the components of a test strategy and get a basic understanding), test planning and effort estimation (based on the number of test cases per day per person, the complexity of the test cases and their priority; complexity and priority should be decided by the testing team only). Based on that, the test scripts are assigned to the testers in Quality Center; this is task management. Any downtime is also captured and tracked. Once all the processes are done and the project is complete, we prepare an exit report. This covers all downtimes, issues encountered, defects backlogged for future releases, functionalities completed, in-scope functionalities not tested, and approvals from the project stakeholders (please have a look at the exit report to get an idea of its components).
Also, we constantly interact with the customers, provide inputs and suggestions (make sure you have one good suggestion that you made to the customer ready as an answer), attend defect calls, daily calls with the onshore folks, etc.
You can also mention, in between, the testing you have been doing: functional testing, integration, system, regression, automation (if any), user acceptance (by discussing with the business and understanding the business logic and expectations), and unit testing on ETL (drafting test cases from the ETL transformations when there is no specific business requirement but a technical requirement to make sure the data loaded is valid).
Data warehouse Basics - Mani
Data warehouse is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis [1].
Dimension table:
In data warehousing, a dimension table is one of the set of companion tables to a fact table.
The fact table contains business facts or measures and foreign keys which refer to candidate keys (normally primary keys) in the dimension tables.
The dimension tables contain attributes (or fields) used to constrain and group data when performing data warehousing queries.
Over time, the attributes of a given row in a dimension table may change. For example, the shipping address for a company may change. Kimball refers to this phenomenon as Slowly Changing Dimensions. Strategies for dealing with this kind of change are divided into three categories:
- Type One - Simply overwrite the old value(s).
- Type Two - Add a new row containing the new value(s), and distinguish between the rows using tuple-versioning techniques (a small SQL sketch follows this list).
- Type Three - Add a new attribute to the existing row.
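A minimal sketch of Type Two handling for the shipping address example above; dim_company, its columns and the sequence are assumed names, not from any real schema:

-- Expire the current row, then insert a new current row with the new value.
UPDATE dim_company
   SET curr_flag  = 'N',
       eff_end_dt = SYSDATE
 WHERE company_id = :company_id
   AND curr_flag  = 'Y';

INSERT INTO dim_company
       (company_key, company_id, ship_addr, curr_flag, eff_start_dt, eff_end_dt)
VALUES (dim_company_seq.NEXTVAL, :company_id, :new_ship_addr, 'Y', SYSDATE, NULL);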
Fact Table:
In data warehousing, a fact table consists of the measurements, metrics or facts of a business process. It is often located at the centre of a star schema, surrounded by dimension tables.
Fact tables provide the (usually) additive values that act as independent variables by which dimensional attributes are analyzed. Fact tables are often defined by their grain. The grain of a fact table represents the most atomic level by which the facts may be defined. The grain of a SALES fact table might be stated as "Sales volume by Day by Product by Store". Each record in this fact table is therefore uniquely defined by a day, product and store. Other dimensions might be members of this fact table (such as location/region) but these add nothing to the uniqueness of the fact records. These "affiliate dimensions" allow for additional slices of the independent facts but generally provide insights at a higher level of aggregation (a region contains many stores).
The star schema (sometimes referenced as star join schema) is the simplest style of data warehouse schema. The star schema consists of a few fact tables (possibly only one, justifying the name) referencing any number of dimension tables. The star schema is considered an important special case of the snowflake schema.
The facts that the data warehouse helps analyze are classified along different dimensions: the fact tables hold the main data, while the usually smaller dimension tables describe each value of a dimension and can be joined to fact tables as needed.
Dimension tables have a simple primary key, while fact tables have a compound primary key consisting of the aggregate of relevant dimension keys.
It is common for dimension tables to consolidate redundant data and be in second normal form, while fact tables are usually in third normal form because all data depend on either one dimension or all of them, not on combinations of a few dimensions.
The star schema is a way to implement multi-dimensional database (MDDB) functionality using a mainstream relational database: given the typical commitment to relational databases of most organizations, a specialized multidimensional DBMS is likely to be both expensive and inconvenient.
Another reason for using a star schema is its simplicity from the users' point of view: queries are never complex because the only joins and conditions involve a fact table and a single level of dimension tables, without the indirect dependencies to other tables that are possible in a better normalized snowflake schema.
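As a concrete illustration of these points, here is a minimal star schema in Oracle DDL; all names are made up for the sketch and follow the SALES grain example above:

-- Each dimension has a simple primary key; the fact table's primary key
-- is the combination of the dimension keys (day, product, store).
CREATE TABLE dim_date    (date_key    NUMBER PRIMARY KEY, cal_date   DATE);
CREATE TABLE dim_product (product_key NUMBER PRIMARY KEY, product_nm VARCHAR2(100));
CREATE TABLE dim_store   (store_key   NUMBER PRIMARY KEY, store_nm   VARCHAR2(100));

CREATE TABLE fact_sales (
    date_key    NUMBER REFERENCES dim_date(date_key),
    product_key NUMBER REFERENCES dim_product(product_key),
    store_key   NUMBER REFERENCES dim_store(store_key),
    sales_amt   NUMBER(12,2),
    units_sold  NUMBER,
    CONSTRAINT pk_fact_sales PRIMARY KEY (date_key, product_key, store_key)
);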
Online transaction processing, or OLTP, refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing. The term is somewhat ambiguous; some understand a "transaction" in the context of computer or database transactions, while others (such as the Transaction Processing Performance Council) define it in terms of business or commercial transactions.[1] OLTP has also been used to refer to processing in which the system responds immediately to user requests. An automatic teller machine (ATM) for a bank is an example of a commercial transaction processing application.
Online analytical processing, or OLAP (pronounced /ˈoʊlæp/), is an approach to quickly answer multi-dimensional analytical queries.[1] OLAP is part of the broader category of business intelligence, which also encompasses relational reporting and data mining.[2] The typical applications of OLAP are in business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas. The term OLAP was created as a slight modification of the traditional database term OLTP (Online Transaction Processing).[3]
Databases configured for OLAP use a multidimensional data model, allowing for complex analytical and ad-hoc queries with a rapid execution time. They borrow aspects of navigational databases and hierarchical databases that are faster than relational databases.[4]
The output of an OLAP query is typically displayed in a matrix (or pivot) format. The dimensions form the rows and columns of the matrix; the measures form the values.
At the core of any OLAP system is the concept of an OLAP cube (also called a multidimensional cube or a hypercube). It consists of numeric facts called measures which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.
Each measure can be thought of as having a set of labels, or meta-data associated with it. A dimension is what describes these labels; it provides information about the measure.
A simple example would be a cube that contains a store's sales as a measure, and Date/Time as a dimension. Each Sale has a Date/Time label that describes more about that sale.
Any number of dimensions can be added to the structure such as Store, Cashier, or Customer by adding a column to the fact table. This allows an analyst to view the measures along any combination of the dimensions.
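A hedged example of viewing a measure along combinations of dimensions, reusing the hypothetical star schema sketched earlier; GROUP BY CUBE produces the subtotals a pivot view would display:

-- Total sales by store, by product, by store and product, and overall.
SELECT s.store_nm,
       p.product_nm,
       SUM(f.sales_amt) AS total_sales
  FROM fact_sales f
  JOIN dim_store   s ON s.store_key   = f.store_key
  JOIN dim_product p ON p.product_key = f.product_key
 GROUP BY CUBE (s.store_nm, p.product_nm);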
ETL Steps
Datalab sends a space-formatted flat file. The field layout is provided by the business; based on that layout, the file is generated by Datalab and sent to the ETL team.
The testing team should validate the data in the file. We can use Microsoft Access or Microsoft Excel to convert the space-formatted flat file into a readable format and validate the layouts and the values in the file.
Once the data is validated, it is loaded into the database using Ab Initio graphs.
The testing team should force-start the ETL batch processes and monitor them for results. If any graph in the process fails, the tester should navigate to the error logs and review the errors and the rejected files. This is done on the Unix box. The scheduler for Ab Initio is Autosys, which is used for running and monitoring the batch jobs on the Unix box.
The first step is the GateKeeper process. The code and test scripts for this process are derived from the data model, and the scenarios are validated based on that data model.
Next, the data is validated against the business requirements and business rules. Data integrity, i.e. the relationships between the tables and the data, is also validated; we have to make sure no orphan records are generated in any table.
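An orphan-record check can be sketched like this; loan_master and loan_detail are assumed parent/child table names for illustration:

-- Any row returned is a child record with no matching parent, which
-- should not exist after the load.
SELECT ch.loan_id, ch.detail_id
  FROM loan_detail ch
 WHERE NOT EXISTS (SELECT 1
                     FROM loan_master pa
                    WHERE pa.loan_id = ch.loan_id);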
All of the data validation is done using Oracle SQL queries. The tool can be either TOAD or SQL Developer for Symphony.
Once the data and data integrity are validated, we test the data through the application, perform complete UI, functional and regression validations, and capture all the transactions (dispositions) in the database tables.
And this is validated using Oracle SQL queries.
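A small hedged example of such a check; the disposition table and its columns are assumed names, not the real schema:

-- Assumed table/column names: confirm that a disposition entered in the
-- web application was written to the database for the customer under test.
SELECT d.disposition_cd,
       d.created_by,
       d.created_dt
  FROM disposition d
 WHERE d.cust_id = :cust_id
 ORDER BY d.created_dt DESC;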
Once the transactional data is validated, it is extracted, transformed and loaded into the reporting tables using Ab Initio graphs. This is the data warehousing process in the project. It is not a straightforward load; there are business rules and requirements against which it is validated, again using Oracle SQL queries.
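One way to validate such a rule-based load is an aggregate reconciliation; the table and column names below are assumptions for the sketch:

-- Compare totals by product between the source and the reporting table;
-- any row returned is a mismatch that needs investigation.
SELECT NVL(s.product_cd, t.product_cd) AS product_cd,
       s.src_amt,
       t.rpt_amt
  FROM (SELECT product_cd, SUM(loan_amount) AS src_amt
          FROM loan_master
         GROUP BY product_cd) s
  FULL OUTER JOIN
       (SELECT product_cd, SUM(loan_amount) AS rpt_amt
          FROM rpt_loan_fact
         GROUP BY product_cd) t
    ON s.product_cd = t.product_cd
 WHERE NVL(s.src_amt, -1) <> NVL(t.rpt_amt, -1);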
Once the data in the reporting tables is validated, it is tested in the front end using the BI tool Cognos. The data in the reporting tables is loaded in a multi-dimensional format, and a star schema is used for loading and retrieving the data.
All these validations have a proper test strategy, test plan and test cases, which are loaded into Quality Center and executed. Any issue is logged as a defect in QC and is discussed and validated with the business owners and the development team.
Data Validation Roles and Responsibilities:
Validate the boundary conditions
Validate the characteristics (alpha, numeric, etc.)
Not-null validations (constraints and triggers validation); sample field-level queries follow this list
Data integrity (Relationship validations, Parent-child)
Validating the error records through error logs in Unix.
Writing joins and Oracle SQL queries in the DB (Oracle): we actually join more than 16 tables, create a materialized view, and validate the data against the source files.
Validate the data mapping: whether the correct column from the source is mapped to the correct column in the DB
Write test strategy, test plans, test cases for ETL and Data Warehousing process
Use Cognos (BI tool) to validate the data in Business Intelligence.
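Sample field-level queries, as mentioned above; the customer table and its columns are assumed, and each query should return zero rows if the load is clean:

-- Not-null check on a mandatory field.
SELECT cust_id FROM customer WHERE ssn IS NULL;

-- Characteristic check: SSN must be numeric and exactly nine digits.
SELECT cust_id, ssn
  FROM customer
 WHERE NOT REGEXP_LIKE(ssn, '^[0-9]{9}$');

-- Boundary check: loan amount must fall within the range in the field layout.
SELECT cust_id, loan_amount
  FROM customer
 WHERE loan_amount NOT BETWEEN 0 AND 99999.99;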
Testing Concepts Basics
Test Strategy:
A test strategy is an outline that describes the testing portion of the software development cycle. It is created to inform project managers, testers, and developers about some key issues of the testing process. This includes the testing objective, methods of testing new functions, total time and resources required for the project, and the testing environment.
In the test strategy is described how the product risks of the stakeholders are mitigated in the test levels, which test types are performed in the test levels, and which entry and exit criteria apply.
The test strategy is created based on development design documents. The system design document is the main one used and occasionally, the conceptual design document can be referred to. The design documents describe the functionalities of the software to be enabled in the upcoming release. For every set of development design, a corresponding test strategy should be created to test the new feature sets.
Components of Test Strategy:
Purpose: Testing Scope and Approach
Out of scope: release or phase specific
Data Preparation: flat file, manipulation of data in SQL queries, ETL for web application
Test Environment: ETL, Dev3, Chordiant, Tibco
Defect Management, roles and responsibilities: who is the defect coordinator, who is the manager, who is the tech lead.
Entrance Criteria: prerequisites for starting the testing; requirements must be frozen, documents signed off, the application must be ready, the test strategy must be reviewed and approved, and there should not be any show stoppers.
Exit Criteria:
· All high defects must be closed; all open medium defects must be in backlogged status, with comments from the project owners.
· All defects must have a status.
· The exit report must be prepared and approved by the managers.
Risks and Contingencies: what if code delivery is delayed or the environment is unstable; how do we re-plan our testing dates, resources, etc.
Planning Assumptions: test execution schedule: how many test cases, how many rounds of execution are required. A buffer also needs to be planned.
Testing Deliverables and Artifacts
SharePoint.
Project stake holder’s review and approvals
Everything needs to be approved.
Test Plan:
A test plan documents the strategy that will be used to verify and ensure that a product or system meets its design specifications and other requirements. A test plan is usually prepared by or with significant input from Test Engineers.
Or
High-level test scenarios, scope, and understanding of the functionality.
Test Cases:
A test case in software engineering is a set of conditions or variables under which a tester will determine whether an application or software system is working correctly or not.
A test script in software testing is a set of instructions that will be performed on the system under test to test that the system functions as expected.
A test case is also referred to as a test script.
Components of test case:
Brief Description
Prerequisite
Data dependencies
Test environments involved
Actions
Expected result
Actual Result
A software bug is the common term used to describe an error, flaw, mistake, failure, or fault in a computer program or system that produces an incorrect or unexpected result, or causes it to behave in unintended ways. Most bugs arise from mistakes and errors made by people in either a program's source code or its design, and a few are caused by compilers producing incorrect code.
If the actual result does not match the expected result, it is called a defect or bug.
Defect Life Cycle:
New -> Open -> Review -> Fixed -> Test Ready -> Retest -> Reopen -> Closed
New -> Rejected (if the defect is not a valid defect)
How to determine Severity:
· High (within 4 to 6 hrs)
· Medium
· low
· Show stopper (within an hour, immediately)
Testing Methods:
Black Box
White Box
Grey Box
Defect Reports:
How many total test scripts, how many have been tested, how many passed, how many failed, how many application defects, how many environment defects.
Base documents used to prepare the test plan and test cases:
Requirements doc, design doc, use cases & artifacts
Test Levels:
Functional Testing Levels:
Unit Testing - "Unit testing" focuses on testing a unit of the code: a piece of functionality, a module, a conditional statement, a procedure, etc.
Integration testing -This ‘level of testing’ focuses on testing the integration of “units of code” or components.
System Testing - "System testing" is the next level of testing. It focuses on testing the system as a whole. Once the components are integrated, the system as a whole needs to be rigorously tested to ensure that it meets the quality standards.
Acceptance Testing (UAT)
Alpha testing & Beta testing: before production, the tested code is deployed to a staging environment and tested there.
Regression testing:
Smoke/Sanity Test: performed after every build.
Non Functional testing levels: performance, scalability, security
Performance or Load testing
Stability testing
Usability testing: user friendly
Security testing
Internationalization and localization
Testing processes
Waterfall development
Agile model
ETL Tool - Development: Ab Initio
Database: Oracle 10g
A sample testing cycle (STLC)
Although variations exist between organizations, there is a typical cycle for testing[33]. The sample below is common among organizations employing the Waterfall development model.
▪ Requirements analysis: Testing should begin in the requirements phase of the software development life cycle. During the design phase, testers work with developers in determining what aspects of a design are testable and with what parameters those tests work.
▪ Test planning: Test strategy, test plan, test bed creation. Since many activities will be carried out during testing, a plan is needed.
▪ Test development: Test procedures, test scenarios, test cases, test datasets, test scripts to use in testing software.
▪ Test execution: Testers execute the software based on the plans and test documents then report any errors found to the development team.
▪ Test reporting: Once testing is completed, testers generate metrics and make final reports on their test effort and whether or not the software tested is ready for release.
▪ Test result analysis: Or Defect Analysis, is done by the development team usually along with the client, in order to decide what defects should be treated, fixed, rejected (i.e. found software working properly) or deferred to be dealt with later.
▪ Defect Retesting: Once a defect has been dealt with by the development team, it is retested by the testing team. AKA Resolution testing.
▪ Regression testing: It is common to have a small test program built of a subset of tests, for each integration of new, modified, or fixed software, in order to ensure that the latest delivery has not ruined anything, and that the software product as a whole is still working correctly.
Test Closure: Once the test meets the exit criteria, the activities such as capturing the key outputs, lessons learned, results, logs, documents related to the project are archived and used as a reference for future projects.
Current Project description:
Brief about project - Symphony intends to facilitate thousands of CitiFinancial employees, including branch managers, district managers and senior executives. Around 2,100 branches across the US, Canada and Puerto Rico are expected to use this application to serve CitiFinancial customers. The application is designed to support day-to-day activities like registering personal loan applications, loan payments, Product Advisor, Cash Drawer, updating customer information, managing product insurance, etc. The Symphony project will eventually replace many manual and tedious processes that branch employees currently follow.
Functionalities – Payment & Cash Drawer end-to-end testing
Role - Test Lead - Testing Coordinator (Defect Coordinator, onsite coordinator)
Responsibility & deliverables - preparing test cases, test reviews, managing the offshore team, involvement in requirements analysis & reviews, involvement during workshops & meetings.
My role: I have been involved in the end-to-end testing. In this project I mainly focus on two things, ETL testing and the web application, and for both I have worked extensively in black box and gray box testing. The basic system infrastructure here is: the web application, Action (mainframes), which is the key source of customer data, Tibco (integration), and the Oracle database.
Customer data is extracted from the existing database, generated as a space-delimited flat file, and sent to the ETL team. The ETL team picks up the file and loads the data into the new database.
For ETL, I mainly prepare test scripts based on the standard layout and validate the data; we then extract the flat files into Excel using an Excel macro.
For the web app we do both black box and gray box testing. We need to make sure payment-related information is fed into the database.
Sometimes we split the user stories, so we do black box and gray box testing for both ETL and the web application.
User Interface testing: UI spec documents, icons, images, button labels (black box testing).
General responsibilities: daily defect calls, daily reports, offshore calls.
Web Application testing: Chordiant.
Validation: we validate whether the data has been pumped in properly. For functional testing, we have around 20 to 30 thousand records.
Data Reconciliation: Data mapping,
Data reconciliation is element-level checking where each element is validated. This includes matching the source and reflecting an accurate, valid value.
Database reconciliation focuses on the integrity and quality of the entire database or data set. Database reconciliation is a superset of data reconciliation.
We frequently use data reconciliation to address the integrity or accuracy of individual records of data quantities. Database reconciliation is frequently discussed at the completion of a data migration effort where we want to understand if the validity of the entire data set is still intact.
Data integrity testing. We will test it by writing SQL queries.
Migrating data from the old database to the new database; the files are space delimited.
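A simple reconciliation sketch for such a migration; the table name and the legacy_db database link are assumptions for illustration:

-- Compare record counts between the old and new databases over a DB link;
-- the two counts should match after the migration.
SELECT (SELECT COUNT(*) FROM customer@legacy_db) AS old_cnt,
       (SELECT COUNT(*) FROM customer)           AS new_cnt
  FROM dual;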
Gray box: running SQL queries in the database (data manipulation, data validation); anything we verify or create in the backend is called gray box testing.
White box testing: getting into the code and validating the code & conditions (validating Java code).