The expertise represented here is the result of 38 years’ experience in most phases of power plant management, engineering and operations, followed by work as a consultant/expert witness. Along with a personal background statement, the site includes some original essays on various subjects of interest to power plant managers and engineers. These are intended as both a contribution to the industry and an insight into my operating philosophy.
The essays are for information only, and do not imply any commercial or advisory relationship with readers/users, who must be solely responsible for actions appropriate to their unique situations. They are “works in progress” and will be augmented and revised from time to time.
Included is a brief outline of my background and experience. A complete CV will be provided to qualified requestors who contact me via E-Mail.
Robert H. Bielecki PE, MSME
Of my 38 years of electric utility experience, 20 were in engineering and management positions at large generating plants, and 18 in central office positions involved in their siting, design, and support.
The plant positions ranged from engineer to superintendent, and included significant supervisory time in operations, maintenance, and plant engineering. The plant types included base-loaded, coal-fired saturated and supercritical steam cycles, as well as major coal and oil-fired units subjected to two-shift operation. Three years were spent directing the initial preparation and staffing of a 2-unit nuclear station under construction, during which time I passed the classroom qualifications for reactor operator.
The central office experience was varied, including assignments during a 5-year stint in design engineering ranging from nuclear plant bid evaluations and siting studies to responsibility for installation of several gas turbine-powered peaking units and resolution of startup problems on a steam unit repowered with gas turbine drive.
Approximately 10 central office years were occupied directing the plant betterment efforts of the corporate generation department, which included responsibility for plant support not classed as maintenance or as the direct province of the corporate engineering department. This included management of the chemical laboratory, mechanical test section, and retrofit startup group. It also included responsibility for the application of uniform operating practices, emergency planning, plant performance evaluation, event analysis, coordination of operational and environmental compliance activities, and providing operational input to new and retrofit plant designs.
Three years were devoted to resolution of special problems, chief of which was the necessity of developing and directing a workable corporate asbestos control program.
I represented my employer for a total of 15 years in the EPRI (Electric Power Research Institute) advisory structure, serving on various committees in the plant operations and environmental control areas and participating in the 1983 EPRI-sponsored evaluation of Japanese vs. American power plants. I also served a total of 9 years on the PEA (Penna. Electric Association) Power Generation Committee.
I am a registered engineer in the State of Pennsylvania and a 38-year member of the ASME.
Post-utility consulting work as an investigator/expert witness has familiarized me with the design and operation of waste-to-energy plants.
I hold a BME from RPI and an MSME from Bucknell University. Among several training schools attended throughout my career, the most prominent are the previously mentioned Reactor Operator Qualifications Course and the MIT Reactor Safety School.
A STRATEGY TO COPE WITH POWER PLANT AGING UNDER COMPETITIVE CONDITIONS
© 1997 R. H. Bielecki
Current conditions in the electric utility industry are nearly a worst case from the standpoint of existing fossil plant usage and aging. Many plants are approaching their nominal design lives and are placed under market-driven economic restraints at exactly the time when, under conventional service strategies, they would receive major rebuilds. Yet a premature shutdown of large capacity stations could have severe financial consequences for the owner. The situation calls for keeping the plants performing reliably, while limiting expenses until the industry shakes out under new economic rules.
While performance improvement and operating economics remain important, the key to sustaining a plant through this transition is to avoid major equipment failures requiring costly long-outage fixes. These, particularly if unexpected, pose a “beggar’s choice” to the owner company, forcing the commitment of major funds piecemeal and making a shutdown decision, if warranted, more and more difficult.
Unfortunately, the logic of normal maintenance is very different from that needed to prevent major component failures, and limited resources impose the need for a system of priorities governing preemptive inspections and repairs.
This essay classifies major failures into three categories: imposed, dynamic, and progressive. From the author’s experience base, common shortcomings in dealing with each class are identified and measures to correct them are recommended as part of an overall strategy to combat aging. The emphasis is on preventing progressive failures, which stem from causes such as metal creep and do not generally show up in maintenance histories.
A system of setting inspection priority is outlined which rates plant component failures from the standpoints of cost, safety, and probability of occurrence.
Lastly, unique experiences are interspersed as examples throughout the report.
II. CLASSIFICATION OF FAILURES
Major equipment failures can be divided into three overall classes, each of which requires a different strategy of prevention:
· Imposed Failures - As the name implies, an agent outside the plant itself imposes these failures. They include acts of God (flood, storms, lightning etc.) as well as sabotage, and are generally coupled with an emergency condition. They also include regulatory failures (e.g. the imposition of regulations tighter than equipment operation will allow, forcing curtailment of service or shutdown).
· Dynamic Failures - The main contributors to this class, which occurs in the very short term and is caused by breach of good practice or equipment limits, are operator error and control system failure. Failures due to progressive deterioration are not included, even though they may show up as sudden events. The key is the degree of control exerted by the operator, and whether or not he/she had the ability to prevent the failure by his/her own action.
· Progressive Failures - These are failures due to the deterioration of component attributes with usage or conditions over time. They are the most difficult for owner-operators to guard against, since in many cases the progressive cause is hidden until it manifests itself in a disastrous way. Because of the severity of consequences, these failures require more concentration as equipment “ages”.
III. STRATEGIES FOR IMPROVEMENT BY CLASS OF FAILURE
Common shortcomings are first reviewed, based on the author’s observations and experience. The recommended remedies for these, taken in the aggregate and organized, become the strategy of an improvement program. The owner-operator must evaluate the cost/benefit of the recommended steps based on his own financial situation.
Imposed Failures
Human nature dislikes preparing for emergency situations, and the tendency is to “hide your head in the sand” until actually faced with one. However, it would be rare in a thermal power plant to operate even two or three years without a legitimate emergency taking place. Fires, floods, storms, lightning strikes and the like all have an impact on operations, and in many cases can result in drastic failure of major equipment.
There are several common shortcomings in dealing with the potential for emergencies:
· Operators are reluctant to thoroughly plan for emergencies, using more immediate concerns as an excuse.
· Meaningful emergency drills are frequently lacking or not taken seriously (operators alone are not to blame for this; management, except in nuclear stations, almost never gives emergency preparation and drill the priority it deserves). A certain degree of personnel and equipment hazard is present during a drill, but what is often ignored are the intangible benefits gained. These usually include a real improvement in operator confidence and knowledge of infrequently used systems.
· Also neglected, and of importance to the current subject, is the ability of good emergency preparation to minimize losses to equipment and recovery costs (an example would be the avoidance of hot steam lead chilling during a flood by building cooldown times into a flood emergency program).
· Usually, emergency procedures are prepared as part of a new plant’s organization. However, they quickly become outdated unless someone mandates their revision (as in the case of chemical spills and hazardous releases).
To remedy these shortcomings, a precondition is the full backing of plant management. Generally, people who have experienced a real emergency tend to have their awareness heightened. In fact, the period immediately after the emergency is a good time to update relevant procedures. However, this state of mind quickly wears off, and an appropriate strategy must insist on regular intervals for updates and drills, best scheduled in the few months preceding the period of maximum exposure.
For instance, annual flood procedure reviews and drills are best done in midwinter, prior to spring thaw. Another good time, at least in the Mid-Atlantic area, would be early summer before the hurricane season.
Scheduling drills during periods of peak demand is normally impractical. When a drill, such as a black start test, is in competition with running economics, it invariably loses out. When loose plans are made to accomplish it “during the first forced outage” they are usually scrapped in favor of an early return to service. In fact, it takes a real effort on the part of plant management to insist on these exercises, since they also must be present directing affairs at inconvenient times and in poor conditions.
Experience has shown the following to be a workable system:
For any given drill or interlock test, choose a “drop-dead” date: that is, a date that the drill must be completed by, even if a shutdown is required. Then set a time window for the drill a few months ahead of that date, and complete it during an outage of opportunity (weekend maintenance or forced). Give the drill equal weight with the priority jobs for the outage, and make sure it is not skipped for a poor reason. An example of a poor reason would be to insist on placing a unit in service Saturday night after a weekend tube repair, wasting a low-cost Sunday when drills could be performed. If the drills or tests reach their “drop-dead” date, complete them even at the cost of an outage. This will probably have to be done only once, since along with it, a check of wasted opportunities should be made, and if any are found, the manager responsible for neglecting them will have that fact reflected in his performance evaluation.
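As an illustration only, the “drop-dead” system can be sketched in a few lines of Python. The 90-day window, the class and function names, and the wording of the returned actions are all assumptions made for the sketch; the essay fixes only the idea of a deadline, a window ahead of it, and a later check for wasted opportunities.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Drill:
    name: str
    drop_dead: date          # drill must be done by this date, even at the cost of a shutdown
    window_days: int = 90    # assumed: window opens this many days before the drop-dead date
    completed: bool = False

    @property
    def window_opens(self) -> date:
        return self.drop_dead - timedelta(days=self.window_days)

    def action(self, today: date, outage_in_progress: bool) -> str:
        """Return what the schedule calls for on a given day."""
        if self.completed:
            return "done"
        if today >= self.drop_dead:
            return "perform now, taking an outage if necessary"
        if today >= self.window_opens and outage_in_progress:
            return "perform during this outage"
        if today >= self.window_opens:
            return "waiting for an outage of opportunity"
        return "not yet due"


def missed_opportunities(drill: Drill, outage_dates: list[date], done_on: date) -> list[date]:
    """Outages that fell inside the window before the drill was finally done."""
    return [d for d in outage_dates if drill.window_opens <= d < done_on]


# Hypothetical example: a flood drill that must be finished before spring thaw.
flood = Drill("flood procedure drill", drop_dead=date(1998, 3, 1))
print(flood.action(date(1998, 1, 10), outage_in_progress=True))
```

The `missed_opportunities` check corresponds to the review of wasted outages that accompanies any drill forced at its deadline.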
Obviously, the first concern in any emergency is safety, but a strategy that only takes this into account without planning to minimize equipment damage is incomplete. In many cases, and in particular that of flooding, the wise use of lead time can reduce the hazard to equipment without increasing danger to people. Enough effort must be spent in emergency planning, building likely “what-if?” scenarios into the emergency procedures.
Lastly, the simple expedient of enforcing an annual review and update of each emergency procedure and assigning it to a particular job title (not individual) should suffice. The key to this, however, as in each of the above cases, is full management support. Because the consequences of many emergencies are potentially so severe, management cannot afford the luxury of a “Mr. Nice-Guy” approach.
Dynamic Failures
All fossil fired power plants have (or should have) detailed operating procedures dealing with equipment startup, shutdown, and running rules. Also, main control systems are generally maintained passably, if only because of their importance to current operations and the interest of many new instrument maintenance people in digital controls.
However, when viewed from the standpoint of avoiding major failures, the following shortcomings in operator preparation are common:
· Operators are inadequately trained to operate systems in a “degraded” condition (with partial or total control failure). The better and more reliable the controls, the less practice operators receive and the more likely they are to err when faced with a “degraded” situation. Once such a situation is in progress, there is usually the potential to drive it deeper through lack of understanding. Frequently, operators are left on their own, and because of pride stay with a bad situation too long, worsening the effects of an initial failure.
A case in point: In a large natural-circulation boiler running at full load, a defective steam drum safety valve began to blow, first intermittently and then continuously. The control room operator elected to maintain his boiler load, reasoning that the steam loss and the noise involved (the plant was in a relatively remote rural location) were not significant when compared to the lost megawatts of a shutdown. After a few hours, massive overheating failures occurred in the upper region of the waterwalls. The operator did not understand that the additional steam flow passing through the safety valve increased the waterwall (not superheater) flow by 20%, reducing the circulation rate and resulting in waterwall DNB (departure from nucleate boiling) with rapid tube overheating.
The key to prevention in these cases does not lie with the operator! Rather, management’s failure to analyze situations, set limits to operator action, and adopt a fail-safe philosophy for operations is mainly at fault. Operators who save unit service without serious consequences, even though they took unwarranted risks to do so, are rarely reprimanded…they are generally regarded as heroes. One of the hardest things for many power station people, operators and supervision, to accept is the occasional need to sacrifice production in the interest of equipment preservation.
Let’s look at another example: The control room operator of a large once-through supercritical boiler, having six coal pulverizers and operating at full load, lost all the coal feeders simultaneously because of a control malfunction. This rare event instantaneously interrupted all fuel flow to the boiler, which, having no steam drum to provide heat capacitance, quickly lost waterwall and steam temperature. The operator, with all control systems tripped to “remote manual”, did a magnificent job of quickly running back turbine load, initiating a stable oil fire and recovering enough coal feed to reverse his temperatures; thus avoiding a disastrous turbine water-induction by the narrowest of margins. The unit was quickly restored to full load and the operator commended for his actions.
Viewed from the perspective of avoiding major failures, the operator did exactly the wrong thing! During his recovery, he was at serious risk of a boiler explosion, one or more pulverizer explosions, and a turbine-wrecking water induction. His safe move would have been to trip the turbine, purge the boiler and start from scratch. This would have cost at least a day of production or more if loaded pulverizers required emptying by hand. But, it would have been consistent with a philosophy of reducing risk and extending unit life.
· Many protective system and interlock tests are neglected in the interest of continued operation, unless required by code or law. Yet these are the very devices that do most to prevent major damage from failures.
· Often, the maintenance of basic local instrumentation (sight glasses, bourdon tubes, ammeters, thermometers etc.) is neglected as distributed control systems are adopted. In cases of control failure, however, they become the operator’s only status indication. The mind-set that considers control failure not credible because of redundant control components and “non-interruptible” power supplies is a booby-trap, as experience has shown.
Case in point: The first “black-start” test after installation of a new distributed control system for two units, along with a “non-interruptible” power supply for the system consisting of solid-state inverters with redundant components and an independent battery bank. The key moment in the test: Simultaneous interruption of normal power supply to the unit auxiliary busses (from the yard transformer; both units were out of service). Result: Three of six vital control room CRTs blank. Reason: Failure of several defective cards in the “non-interruptible” power supply when battery load was applied at the moment of simulated power failure, notwithstanding that the power supply had been routinely tested in accordance with the manufacturer’s specs.
· The final common shortcoming is the lack of casualty training for fossil plant operators. While fossil plant simulators exist, they are not nearly as common as those for nuclear plants, and are mainly used for basic training in startup and shutdown. Operators, particularly at base-load fossil plants, can go several years without seeing a particular kind of failure, and suddenly be faced with it under the worst conditions. With the availability of PC-based simulation, its reduced cost and increased power, there is less and less justification for allowing this situation to continue.
What strategies aid in avoiding major dynamic failures?
To combat the above shortcomings, management should accept the need to indoctrinate operators in conservative principles of operation. These include:
· Educating operators in the degree of degradation acceptable before a unit should be safely shut down. This should be done on a situation-by-situation basis as much as possible, but a general understanding of an “envelope of limits” should be imparted so that no operator has an excuse for riding out an untenable situation.
· Authorizing movement to a safe load point when faced with control failure or unpredictable variation of key parameters.
· Exercising operators, particularly at base load stations, in operation of systems in “remote manual” or “manual” mode, consistent with the principle of the first point above. That is, if management authorizes a particular “degraded” operation to avoid shutdown, then the moves needed to accomplish it should be practiced, either live or by simulation. This could lead to some hard choices.
For instance: Most boiler feedwater systems are equipped with remotely operated full size regulator bypass valves. In many cases these are not operated at all, because their seats tend to cut, and the bypass valve cannot be isolated. If the feedwater regulator fails and has to be isolated, the bypass must be used or the unit must be shut down. Without any practice, normal load changes may be risky to equipment while on the bypass. For the sake of the equipment, management should specify more conservative operating limits at this point, after fully thinking through the hazards. The correct choice in this case may be an absolute prohibition of bypass operation, since the practice needed to develop operator confidence would probably result in valve seat failure, expensive valve repairs, and a forced outage.
Key backup and local instrumentation should receive the same maintenance priority as the elements of the distributed control system. While it is probably too much to ask that all local instrumentation be maintained, an inventory of instruments required in the authorized “degraded” states should be made, and their operability and accuracy assured.
A careful review of interlocks and protective systems should be made, noting the status and test intervals recommended by the manufacturer. An evaluation of the test frequencies, taking all factors into account but weighing heavily in favor of the equipment, should be used in setting the final intervals. Once set, the intervals should be enforced (see the “drop-dead” date system described under Imposed Failures). Most interlock and trip tests involve some degree of risk to the equipment, but in many cases this is due to lack of practice. Some risk should be acceptable in view of the larger objective to protect against the major failure.
Progressive Failures
Time is the key element in progressive failures, working both for and against extended unit service life. The causes usually act over a period of months or years, and are accelerated or delayed by the conditions of operation. While the owner-operator can expect progressive problems to multiply and become more severe with time, he also has that time to use wisely in preparing for their onset.
Most shortcomings in dealing with progressive failures involve the almost irresistible tendency to concentrate resources on current or history-based problems, excluding everything else. As industry competition increases, there will be an even greater impetus to spend maintenance effort on only the short-term problems and put off major fixes.
Almost all availability improvement programs are history-based. That is, the frequency of past equipment problems dictates the allocation of resources. A base design availability is assumed; however, progressive failures strike directly at the base design (e.g. code design of a high pressure/temperature pipe assumes a certain yield strength; metal creep over time reduces that yield strength, negating the design basis). Many progressive problems have no previous history and yet result in drastic, sudden failure.
Other major shortcomings in this area include:
· the lack of a reasoned, comprehensive means of setting inspection and evaluation priorities for problems with no history.
· the tendency to skew priorities toward problems susceptible to new “gadgetry”.
· refusal or inability to use knowledge already available. For instance, symptoms related to progressive problems may be noticed and even correctly evaluated by local forces, but do not find their way into maintenance planning because of organizational separation or poor communication.
· insufficient consideration of progressive failures in setting operating limits and ramp rates. Manufacturer’s recommended limits assume that design parameters remain unchanged over time, and operating instructions based on them may not be conservative enough.
At this point, let’s digress to look at some typical progressive failure mechanisms:
| Progressive Failure Mechanism | Components Affected (typical) | Failure Accelerators | Failure Decelerators |
|---|---|---|---|
| Creep | Turbine HP rotors, IP rotors, casings; boiler SH & RH headers; MS piping & tees; hot RH pipe | Operation at steady-state temperatures above design; operation at stress levels above design | Temperatures and stress levels lower than design |
| Temper embrittlement (shift in brittle-fracture temp. upward) | Boiler headers & drums, HP/IP rotors & casings, blades, combustion turbine disks | Extended operation in the embrittlement range (650°F to 1000°F) | Minimize time in the range. Operate to hold stress low until above brittle-fracture temp. |
| Fatigue (thermal) | MS piping, boiler SH & RH headers, IP piping, stop valves, turbine rotors, casings | Rapid temperature cycling & oscillation; chilling during shutdown; poor SH & RH temp control | Reduce load change ramp rates & reversals. Improve temperature control |
| Fatigue (vibration-induced) | Blades, rotors (all), structure, generator windings, boiler ID & FD fans | Extended resonant operation (critical speeds); poor balance | Minimize cumulative time at critical speeds; hyper-balance |
| Fatigue (corrosion-accelerated) | IP/LP rotors & expansion joints, crossovers, casings, boiler fireside components | High solids carryover in steam; high chloride (Cl) levels in steam | Maintain FW purity, reduce chlorides; maintain oxidizing atmosphere in boiler firebox |
| Flashing erosion | Valve bodies, piping downstream of heaters, valves, startup systems | Heavy cycling, hard throttling service | Better material, less throttling service |
| Electrical insulation breakdown | Transformers, generators, cables | Moist environment, dirt, motion, thermal oscillations, insulating oil impurities | Clean, dry, undisturbed environment, steady temperatures |
| Thermal insulation deterioration (asbestos) | All thermal insulation installed before 1970 | Motion, skin deterioration, impact | Renew containment, housekeeping |
Remedies for progressive problems fall into three (3) basic categories:
1. Building consideration of progressive problems into routine outage maintenance; and performing inspections, tests and analyses based on a reasoned, comprehensive priority system.
2. Adopting and enforcing conservative operating attitudes and physical limits based on the recognition of progressive failure mechanisms.
3. Insisting on good general practices that have been shown to slow the advance of progressive failures (e.g. lubrication, balance).
A Workable Priority System
If an organization had infinite resources and time, every power plant system and component could be quantitatively analyzed for progressive failure potential … there would be no need to choose areas of concentration or to compete with “normal” maintenance priorities. This obviously not being the case, it’s incumbent on the owner-operator to develop a scope of activities that, within his limited resources, gives him the best chance of avoiding progressive “life-threatening” failures. His strategy must address the following questions:
· What components must I consider?
· What information do I need?
· What components take priority?
· What techniques and methods are required?
While there are any number of possible priority systems, the “grid method” outlined here has been successfully used in the past, and serves to illustrate the principles needed to solve the problem. All systems require some judgment and subjectivity as well as sound science…the trick is to provide a comprehensive framework that combines these into an effective effort.
The Logic Behind the Process:
A progressive failure should be measured against the yardsticks of cost, safety, and loss of use when determining the priority of remedial efforts.
Cost includes all repair costs (parts, labor, overhead and contract).
The likelihood of hazard to life and limb if a progressive failure mechanism goes unchecked until the point of sudden failure is the measure of the safety criterion.
The loss of use criterion can also be expressed as a substitute power cost, but it represents a broader consideration; for instance, failure of a generator rotor may involve waiting up to a year for production of a replacement, the major factor being production of a rotor forging. Facing a year of downtime, the owner would be under extreme pressure to utilize that time for an expensive rebuild, or to decide on decommissioning. Even a choice to limit the work to necessary repairs still involves personnel deactivation and layup costs (an administrative nightmare). These concerns leverage inspection priorities upward for components needing long-outage repairs.
For each component, answer the following questions:
· What likely progressive failure mechanism(s) act on the component?
· If the component fails suddenly, what is a ballpark replacement and repair cost?
· If the component fails suddenly, what safety hazard is likely?
· If the component fails suddenly, what downtime for recovery and repair is needed?
Obviously, a thorough answer to these questions could involve a full-fledged investigation at great expense; and while this could be justified in a few cases, it is impractical in the great majority. So, what can be done to come up with reasonable priorities?
The “grid” approach outlined here depends on acceptance of a few postulates:
First, in addition to scientific, quantitative analysis, objective and subjective opinions of knowledgeable people have significant value. The more relevant the experience and knowledge to a problem, the closer the opinion will be to the analytical answer. In fact, the ability to apply local knowledge and judgment may occasionally make the opinion superior to the analysis.
Case in point: In a large supercritical cycle, a 10” high-pressure heater drain was given routine inspection priority by maintenance planners. Operators had observed that during startups and low load changes the line frequently oscillated one or two feet. Local supervision assumed this was due to steam hammer because of control malfunction of the heater drain valve. However, they dutifully fed this information back to the planners, who took a harder look and discovered a number of problems which acted to increase inspection priority: 1) a design error in setting the heater elevations, 2) severe pipe thinning downstream of the drain valve (flashing erosion), and 3) high combined stresses at connections because of inadequate pipe supports and restraints.
Second, the aggregate opinion of several knowledgeable, experienced people is generally more correct than a single opinion. This is particularly relevant and true in incorporating industry experience into priority judgments. Unlike the nuclear world, there is no comprehensive system of industry-wide experience review for fossil plants. Some information is exchanged at EEI meetings and the like, but it is incomplete, and will become more so as competition takes hold. Even if it were comprehensive, fossil plant management does not have the resources to devote to such a massive job of sifting information. However, the probability that one or a few of several people will be acquainted with a variety of individual industry events is fairly high. Providing a mechanism for this knowledge to be incorporated, while not 100% foolproof, should result in a better priority system.
Third, a forced ranking of equipment priorities, even with some subjectivity, provides a necessary discipline in the process, imposing some risk and responsibility on the planners and responsible engineers.
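The second and third postulates can be illustrated with a short Python sketch that pools several people’s rankings into one forced ranking. Averaging each component’s position is only one of many possible pooling rules and is an assumption of this sketch, as are the function name, the respondents, and the sample rankings.

```python
from statistics import mean


def forced_ranking(polls: dict[str, list[str]]) -> list[str]:
    """Combine several people's rankings into one forced ranking.

    polls maps each respondent to a list of components ordered from
    highest inspection priority to lowest. Every respondent must rank
    the same set of components, once each. Components are ordered by
    mean position; ties are broken alphabetically so the result is
    always a strict order.
    """
    rankings = list(polls.values())
    components = set(rankings[0])
    for r in rankings:
        if set(r) != components or len(r) != len(components):
            raise ValueError("every respondent must rank the same components once each")
    mean_pos = {c: mean(r.index(c) for r in rankings) for c in components}
    return sorted(components, key=lambda c: (mean_pos[c], c))


# Hypothetical poll of three knowledgeable people.
polls = {
    "plant engineer": ["MS piping", "HP rotor", "HP casing", "Condenser"],
    "maintenance planner": ["HP rotor", "MS piping", "Condenser", "HP casing"],
    "shift supervisor": ["MS piping", "HP casing", "HP rotor", "Condenser"],
}
print(forced_ranking(polls))
# ['MS piping', 'HP rotor', 'HP casing', 'Condenser']
```

A strict order with no ties is deliberate: it forces the participants to commit to a priority, which is the discipline the third postulate asks for.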
The “Grid” Method of Setting Inspection Priorities
Step A: Break the power plant into major components based on the consideration of possible failure mechanisms. A first cut would list large block components, while later cuts may break these up based on analysis and judgment. An initial component may be main steam piping; later, the piping may be segmented into a) girth welds and heat affected zones, b) straight sections, and c) bends. Further segmentation could occur after a stress analysis. As an illustration, an initial list might be:
| HP rotor | IP rotor | Generator rotor |
| HP casing | IP casing | LP rotor |
| Condenser | MS piping | FW piping |
| BFP | Condensate pump | Hot RH piping |
| Cold RH piping | Boiler drum | Outlet SH header |
| ID fans | FD fans | Int. SH header |
| Blr waterwall | Blr superheater | Blr economizer |
| Main transformer | Structural steel | Ground grid |
| Generator stator | Turbine blading | Switchgear |
| MS stop valves | Turbine throttle valves | RH intercept valves |
| Pulverizers | Silos | Hoppers |
| Stack | Precipitator | Breechings |
Step B: Create the first of three (3) Priority Grids:
This grid has as its ordinate the criterion cost of repair; the abscissa, common to all grids, is probability of failure. In neither case will quantitative values be necessary. The important thing is to rank the components relative to each other, each component occupying a block in the grid. The ranking can be accomplished in several ways, one of which is to poll a number of knowledgeable people, weighting their rankings with expert opinion and existing quantitative data on the likely progressive mode of failure. Using the sample component breakdown above, a “cost grid” would resemble the following:
The grid array can be as large as one wants. A larger array allows for grouping of components with nearly the same priority. The diagonal line represents a rough break between items that would be included in a progressive failure (technical) inspection program and those subject to normal maintenance priorities. The line can shift up or down depending upon many factors:
· Resources available to devote to technical inspection
· Willingness to take calculated risks
· Confidence in normal maintenance inspections
· Manufacturer’s advice
· Confidence engendered by adoption of conservative operating limits
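The polling approach described under Step B can be sketched in code. This is a minimal illustration rather than a prescribed procedure. It assumes each participant ranks components from 1 (lowest) upward, that each participant carries a hypothetical credibility weight, and that the diagonal line is approximated by a threshold on the sum of the two grid coordinates. The function names and the `cutoff` parameter are invented for this example.

```python
from collections import defaultdict


def weighted_rank(polls, weights):
    """Combine several people's rankings into one relative ordering.

    polls   -- {person: {component: rank}}, where a higher rank means "more"
    weights -- {person: weight}, a hypothetical credibility weight per person
    Returns components sorted from lowest to highest weighted-average rank.
    """
    totals = defaultdict(float)
    weight_sums = defaultdict(float)
    for person, ranking in polls.items():
        w = weights.get(person, 1.0)
        for component, rank in ranking.items():
            totals[component] += w * rank
            weight_sums[component] += w
    averages = {c: totals[c] / weight_sums[c] for c in totals}
    return sorted(averages, key=lambda c: (averages[c], c))


def place_on_grid(cost_order, prob_order, size):
    """Map two relative orderings onto a size-by-size grid.

    Both orderings must contain the same components.
    Returns {component: (probability_column, cost_row)}, both 0-based.
    Components close together in an ordering may share a row or column.
    """
    def bucket(order):
        n = len(order)
        return {c: (i * size) // n for i, c in enumerate(order)}

    cols, rows = bucket(prob_order), bucket(cost_order)
    return {c: (cols[c], rows[c]) for c in cost_order}


def above_diagonal(cell, size, cutoff=0):
    """True if a cell lies on or above the anti-diagonal, shifted by `cutoff`."""
    col, row = cell
    return col + row >= (size - 1) + cutoff
```

Raising `cutoff` moves the dividing line toward the high-cost, high-probability corner, which corresponds to having fewer resources for technical inspection or a greater willingness to take calculated risks.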
The ranking in the above grid is for illustration only, and not to be construed as a definitive recommendation for the reader. However, let’s take one component out of the grid and examine typical reasoning that could be used in assigning its relative priority:
Looking at the HP casing, what are the likely modes of failure in an older unit, and how do they manifest themselves? At least part of the casing is subject to throttle steam temperatures…therefore, over a period of time metal creep can be expected to occur, most likely at stress concentration points (risers) in the highest temperature zone. An external indication of creep is distortion and cracking at the stress risers. Concurrently, thermal fatigue can be a major problem, particularly for units subject to cycling service affected by occasional water inductions. Since the casing has no moving parts, dynamic failure mechanisms such as mechanically induced fatigue are less likely.
Factor in at this point any recollections or available records of unusual aggravating events or conditions (e.g. a severe water induction, carrying throttle temperatures significantly higher than design).
The probability of ultimate failure is nearly certain with time, since creep is unavoidable at current throttle temperatures. However, drastic failure is unlikely and casings can be weld-repaired up to a point. Repairs will be expensive, but not inordinate. Replacement at great expense will ultimately be necessary, but some years’ notice is probable.
The final position in the priority grid seems to recognize the above reasoning.
Step C: Create the Safety Priority Grid
Assuming each failure is sudden (where that is possible), component failures pose varying safety hazards to personnel…these depend on the potential magnitude of energy release, proximity and numbers of people, rapid egress capability, ability to limit or stop the failure effects etc. Obviously, pressure part failure presents the greatest hazard, followed closely by destructive overspeed and furnace explosion. The latter two are the effects of dynamic failure mechanisms and do not belong in this priority grid.
A grid using the same components, but treating degree of hazard as the ordinate, might look like:
There is no intent in this ranking to augment or conflict with any safety program, which should have as its goal the elimination of accidents by any practical means. All equipment can be involved in safety, and one can always postulate an accident or hazard involving even the most innocuous component. The ranking is strictly for the limited purpose of weighting inspection priorities.
Step D: Create the Loss-of-Use Priority Grid
Major component failures, particularly sudden ones, require outages of varying duration for repairs. More than monetary loss is involved if the outage is very long (see p.9). In addition to its usefulness in setting inspection priorities, this grid may inspire some preemptive measures to save outage time in the event of a failure (e.g. cooperating with several utilities with similar generators for the common purchase of a spare rotor forging; pre-thinking rigging methods and routes; locating independent shop facilities with rapid turnaround capability).
Step E: Arrive at Final Inspection Priorities
Of course, other grids could be created to influence final inspection priority. Three that come to mind would be an environmental impact grid, a regulatory impact grid (not the same), and a vendor’s support grid. In fact, any important criterion could be introduced into the mix.
Assigning a final inspection priority is a judgment call, which looks at and evaluates the relative priorities of the grids, weighting and combining them to arrive at final priorities. All the system does is ensure consideration of all significant components, ask the right questions, and force the development of consistent, relevant information.
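Because the final ranking is a judgment call, any formula can only serve as a starting point for that judgment. As a rough sketch, assuming each grid has been reduced to a score between 0 and 1 for each component, the grids could be blended into a 1-to-10 priority as follows. The weights and the scores are hypothetical and chosen only to show the mechanics.

```python
def final_priority(grid_scores, grid_weights):
    """Combine per-grid scores into a single 1-10 inspection priority.

    grid_scores  -- {grid_name: score}, each score between 0.0 and 1.0
    grid_weights -- {grid_name: weight}, hypothetical relative importance
    """
    total_weight = sum(grid_weights[g] for g in grid_scores)
    if total_weight <= 0:
        raise ValueError("grid weights must sum to a positive number")
    blended = sum(grid_scores[g] * grid_weights[g] for g in grid_scores) / total_weight
    # Scale 0.0-1.0 onto 1-10 and round to the nearest whole priority.
    return 1 + round(blended * 9)


# Illustrative weights only; safety is weighted most heavily here by assumption.
WEIGHTS = {"cost": 1.0, "safety": 2.0, "loss_of_use": 1.0}

hot_rh = final_priority({"cost": 0.9, "safety": 1.0, "loss_of_use": 0.9}, WEIGHTS)
ducts = final_priority({"cost": 0.4, "safety": 0.3, "loss_of_use": 0.2}, WEIGHTS)
```

An additional grid, such as environmental or regulatory impact, would enter simply as another key in both dictionaries.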
In addition to pointing up components to be inspected, the priority system aids in specifying the depth of inspection required and the quality of results acceptable.
Let’s look at two components at either end of the priority spectrum:
Hot Reheat Piping ranks very high on all three grids…an operative creep mechanism is likely, and if failure occurs, it will probably be widespread in the girth and seam weld heat-affected zones. Repair would involve extensive and expensive rigging, cutting, welding, and heat-treating operations. The repair outage would last several months, as a minimum. Energy release from a sudden failure would be extreme, as would be the hazard to personnel. The industry has a history of deadly hot reheat failures (Mohave and others).
All these factors indicate not only inclusion in the technical inspection as a high priority item, but the need for high quality results. Sophisticated techniques would be justified, used in combination with simpler, cheaper methods. A baseline inspection would probably involve complete visual and dye-check scans of all welds and heat-affected zones, with ultrasonic and X-ray examination of areas showing distress or welds under high combined stresses.
At the other extreme, while breeching and duct failure could be widespread due to corrosion and expansion problems, potential energy release is moderate (except in the case of furnace explosion, a dynamic failure), and repairs usually can be spaced out over several outages. Visual inspection, perhaps augmented by thermography, is all that should be required.
A final plan with priorities, techniques and frequencies might look like:
Component | Priority (1-10 hi) | Technique | Frequency
Hot RH Piping | 10 | Vis., dye-ck, X-ray | Baseline, then 2yrs
MS Piping | 10 | Vis., dye-ck, X-ray | Baseline, then 2yrs
HP rotor | 9 | Bore UT, mag. part. | Baseline, then 5yrs
IP rotor | 9 | Bore UT, mag. part. | Baseline, then 5yrs
Generator rotor | 9 | Bore UT, dye rings | Baseline, then 4yrs
HP casing | 7 | Vis., dye, replication | Baseline, then 2yrs
Coal silos | 7 | D-meter, visual | Baseline, then 2yrs
T/G blading | 8 | Vis, magnetic particle | With rotor
Main transformer | 8 | Gas tests, hi-pot, vis. | Annual
Boiler drum | 8 | Vis, dc penetrations | Baseline, then 2yrs
LP rotor | 7 | Bore UT, mag. part. | Baseline, then 4yrs
T/G throttle valves | 6 | X-ray, dyecheck | Baseline, then 4yrs
RH intercept valves | 6 | X-ray, dyecheck | Baseline, then 4yrs
Superheater | 6 | Vis, D-meter, sample | Sections annually
SH outlet header | 7 | Vis, X-ray, UT | Baseline, then 2yrs
SH intermediate header | 5 | Vis, UT | Baseline, then 6yrs
Boiler waterwall | 5 | Vis, D-meter, sample | Sections annually
Precipitator | 5 | Visual, spark | Annually
Breechings & ducts | 4 | Visual, thermograph | Annually
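The frequency column of a plan like this can be turned into due dates for outage planning. The sketch below is one possible way to do so, assuming frequencies are written in the forms used in the sample plan and that the baseline year is supplied by the planner. The function name and the years in any call are hypothetical.

```python
import re


def next_inspection_year(frequency, baseline_year, last_year=None):
    """Return the next year an inspection falls due, or None if not calendar-based.

    frequency     -- text from the plan, e.g. "Baseline, then 2yrs" or "Annually"
    baseline_year -- year of the baseline inspection (an assumed input)
    last_year     -- year of the most recent inspection, if later than baseline
    """
    text = frequency.lower()
    match = re.search(r"(\d+)\s*yrs?", text)
    if match:
        interval = int(match.group(1))
    elif "annual" in text:
        interval = 1
    else:
        return None  # e.g. "With rotor": tied to another component's schedule
    return (last_year or baseline_year) + interval
```

For an entry such as "With rotor", the function returns `None`, leaving the planner to tie that inspection to the rotor's own schedule.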
[Grid figure not reproduced: a dashed dividing line separates the region labeled “Inspection program” from the region labeled “Normal maintenance inspections”.]
Working the System
Whether applied in a single power plant or over the entire generating system, an inspection program geared to progressive failures requires a high degree of coordination. This is probably achievable only by dedicating someone full time to the task. Let's call this person the "Coordinating Staff Engineer". Assuming he or she would be applying the "grid", the individual's responsibilities would include:
· Assembling enough design information to prepare an original component list and first drafts of the appropriate priority grids. Accuracy would not be critical at this point. The main objective is to develop a "straw man" for several people to operate on, and to provide a focus for subsequent logic.
· Reviewing service records to begin the process of tuning the priority grids. This does not envision a painstaking compilation and review of every piece of historic data relative to the components; it would be too time consuming and labor intensive. Looking at available summaries, such as station performance reports, for events or parameters involving the components is probably sufficient. The targets would be events that indicate a progressive failure mechanism may be operating, or that conditions imposed on a component may significantly deviate from design.
· Canvassing knowledgeable people for input to the priority grids. The coordinator should spend some time identifying people in the organization who have experience with and knowledge of listed components. Also, people with some knowledge of potential failure mechanisms should be included. The search should not be limited to those currently at the plants or in positions at working or engineering levels.
For instance, if a person was once a plant maintenance supervisor, but had advanced over the years to the management of an unrelated department, it is entirely possible for his past experience to be relevant in setting priorities. If that is the case, he should be included in the process, and the organization's mores should be such that he is expected to contribute.
The canvass inquiry can take several forms. A questionnaire could be prepared based on the initial components; the grid itself could be transmitted to participants, asking them to adjust ranking in accord with their special knowledge; meeting formats could be used to obtain the input; if available, an inquiry to industry data bases (EEI, EPRI etc.) could provide additional insight. In essence, the coordinator should be talented and open-minded enough to incorporate and organize information from many diverse sources.
· Preparing the final priority grids based on all inputs received from the participants.
· Determining the actions necessary to make the inspection process efficient and cost-effective. Included here are measures to reduce the scope and cost of inspections without materially affecting the quality of their results. As an example, the conduct of a combined stress analysis on a high-temperature piping system may serve to identify the points of maximum risk. Sophisticated inspection methods could be confined to these areas, using more economical techniques (visual, dye-check) for the balance of the pipe.
· Directing the activities of "Discipline Engineers" whose task would be to develop an Inspection Specification for a component or group of components.
Unlike the Coordinating Engineer, the Discipline Engineer’s responsibilities relate only to his assigned components. He is expected to:
a) Determine the progressive failure mechanisms subject to inspection on the particular component assigned.
b) Specify the locations to be inspected.
c) Specify any non-destructive or destructive tests or analyses required.
d) Specify the inspection methods and techniques to be used.
e) Working with the Coordinating Engineer, make the necessary adjustments to ensure maximum inspection efficiency and cost-effectiveness.
Once a realistic Inspection Specification covering the desired components exists, the final scope of outage inspection work must be negotiated with the plant supervision responsible for the outage. This is absolutely necessary in order to obtain commitment and an equivalent priority with other outage work. The inspection program would be worthless without this step, since it would automatically assume a lesser priority than immediate maintenance activities.
Ideally, the Discipline Engineers would direct and assist plant maintenance forces and contractors in performing the inspections. The critical activity at this stage is the identification of problems requiring immediate action, in order to keep as many maintenance options open as possible.
At this point, a very simple but profound statement applies: If you look for progressive failure mechanisms in older plants, you will undoubtedly find some. In certain severe cases, the resulting downtime could be the same as if a failure had occurred. The plus side of this situation is the elimination of the safety hazard and consequential damage associated with sudden failure. The minus is the need, because of knowledge, to take corrective action that may be premature but cannot be avoided. This is especially true where safety or code concerns are involved. At the very least, further investigation, usually expensive and time consuming, would be necessary.
After the inspection, the Discipline Engineers would compile and analyze the results, and collaborate with the Coordinating Engineer in preparing a final report for feedback to the plant and the owner’s decision makers. A track of condition would be started and kept on the inspected components, and future inspection specifications would be adjusted based on this hard data.
General Practices
In addition to specific strategies aimed at correcting shortcomings in the prevention of major failure, there are certain general practices, based on common sense, that operate to greatly reduce the probability of entire classes of failure without being targeted at any specific failure mechanism.
· Interlock Checks
Interlocks, as defined here, are devices provided by either the equipment manufacturer or system designer whose sole function, acting independently of human or automatic control, is to prevent failure caused by conditions which exceed designed operating ranges. A true interlock is not a warning device, although alarms and warnings may be provided to anticipate its function. It removes control from the hands of the operator or normal control system, overriding it to bring the equipment to a safe physical state, usually by shutting it down. Interlocks also ensure the sequential or simultaneous shutdown of related auxiliary equipment (e.g. fans, pumps) as well as the triggering of systems or equipment whose sole function is to mitigate abnormal conditions (e.g. DC bearing oil pumps).
The best way to think of interlocks is as providing an envelope within which control or manual operations are permissible, having free scope within appropriate operating norms (ramp rates, temperatures, pressures etc.). A classic interlock is the turbine overspeed trip device, usually a spring-loaded throw weight attached to the turbine shaft, activated by centrifugal force to operate a control oil dump, allowing the stop valves to close. It is positive, close to the physical process, and independent of the control system.
Commonly, operators tend to neglect interlock tests for a number of reasons, some of which are cited here:
1) The plant staff lacks knowledge due to personnel turnover and unfamiliarity with manufacturer’s instructions and design intent. People involved in the startup of a new unit generally are cognizant of new equipment limitations and requirements. In fact, appropriate starting and testing practices include the initial functional tests of all interlocks. At this time, voluminous manufacturer’s instructions are studied with interest and enthusiasm. Procedures for test and operations are written which may be unrealistically stringent and detailed. After a time, however, short cuts are generally developed for reasons of convenience. New engineers and operators replace the originals, and tests perceived as unimportant or risky are done less frequently or dropped altogether.
2) Testing of flawed equipment is often cancelled or postponed because operating in the design margins is judged too risky. A case in point would be deferral of an overspeed test on a rotor suspected of having bore cracks.
3) Interlock tests lend themselves to procrastination, particularly if scheduled at the end of an overhaul when real pressure is applied to return a unit to service. Typically, a commitment is made to conduct the tests at the first opportunity…this is often neglected or forgotten.
· Header Survey
Often, extensive repairs are made to boiler components (tubing, structure, ductwork etc.) without asking the question “Are the components where they are supposed to be?” A boiler is a huge accumulation of parts, continually working and under stresses induced by flows, pressures, and thermal conditions. It would be unrealistic to expect no change in position of major components over time. This position change, if it occurs, can cause or be a factor in several lesser failures. For example, assume a one-foot settlement of an upper waterwall header over time. Several tie-backs would exceed their limits, wall panels would tend to bow out, lower and intermediate headers would bottom out, and the resulting increase in combined stress would aggravate normal tube failure mechanisms, resulting in frequent tube failures.
Every five years or so it would be prudent to conduct a cold survey of all boiler headers to establish their position (particularly their elevation) and compare it to the as-built general arrangement drawings.
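A survey of this kind reduces to comparing measured elevations against the drawing values. The sketch below assumes elevations in feet and a tolerance chosen by the responsible engineer. The function name, header names, and any numbers used with it are hypothetical and do not come from a real unit.

```python
def header_deviations(as_built, surveyed, tolerance_ft):
    """Compare surveyed cold header elevations with as-built drawing values.

    as_built, surveyed -- {header_name: elevation in feet}
    tolerance_ft       -- allowed absolute deviation (an engineering judgment)
    Returns {header_name: deviation} for headers outside tolerance; a negative
    deviation means the header sits lower than the drawing shows.
    """
    flagged = {}
    for name, design_elev in as_built.items():
        if name not in surveyed:
            continue  # not measured in this survey
        deviation = surveyed[name] - design_elev
        if abs(deviation) > tolerance_ft:
            flagged[name] = round(deviation, 3)
    return flagged
```

Keeping the flagged deviations from each survey allows a slow settlement to be distinguished from a sudden shift when successive surveys are compared.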
· Pipe Hanger Survey
Main steam leads, reheat lines (hot & cold), feedwater pipes, and in fact all major plant pipes are supported by hanger systems. These, in addition to handling the dead weight of the pipe, provide enough flexibility to allow for thermal expansion and flow-related vibration without increasing the combined stresses at any point beyond design limits. Generally, spring-loaded (constant support) hangers are used, individual hangers having a scale setting to meter the exact force borne as its share of the total load.
Over time, many factors (thermal cycling, structural shifting, loss of hanger spring temper, creep etc.) may result in distortion sufficient to overload some hangers and unload others. Periodic surveys of the hanger settings, preferably before (hot) and during (cold) each overhaul, would detect this distortion. There are two distinct advantages obtained from the surveys:
1.&nbs